public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Killing herds, again
@ 2019-04-03 17:35 Michał Górny
  2019-04-03 22:52 ` Alec Warner
  2019-04-14 12:57 ` Mart Raudsepp
From: Michał Górny @ 2019-04-03 17:35 UTC
  To: gentoo-dev

Hello, everyone.

Back in 2016, we killed the technical representation of herds.  Some
of them were disbanded completely, others merged with existing projects
or converted into new projects.  This solved some of the problems with
maintainer declarations but it didn't solve the most important problem
herds posed.  Sadly, it seems that the spirit of herds survived along
with those problems.

Herds served as a method of grouping packages by a common topic,
somewhat similar to categories (but usually broader).  In their
mature state, herds either had their own specific maintainers, or were
directly connected to projects (which in turn provided maintainers for
the herds).  Today, we still have many herds that are disguised either
as complete projects, or as semi-projects (i.e. project entries without
an explicit lead, policies or anything else).


What's wrong with herds?
------------------------
The main problem with herds is that they represent an artificial
relation between packages.  The only thing the packages have in common
is their topic, and there is no real reason why a group of people would
maintain all packages regarding the same topic.  In fact, it is
absurd -- say, why would a single person maintain 10+ competing cron
implementations?
Surely, there is some common knowledge related to running cron,
and it is entirely possible that a single person would use a few
different cron implementations on different systems.  But that doesn't
justify creating an artificial project to maintain all cron
implementations.

Mapping this to reality, projects usually represent a few developers,
each of them interested in a specific subset of packages maintained by
the project.  In some cases, this is explicitly noted as project member
roles; in others, it is not stated clearly anywhere.  In both cases,
there is usually some group of packages that are assigned to
the specific project but not maintained by any of the project members.

Less structured projects often have problems tracking member activity.
More than once, a project effectively died when all its members became
inactive, yet its continued listing hid the fact that the relevant
packages were unmaintained, and sometimes discouraged more timid
developers from fixing bugs.


What kind of projects make sense?
---------------------------------
If we are to fight herd-like projects, I think it is important to
consider what kinds of projects make sense, and which ones cause
herd-like trouble.

The two projects maintaining the largest number of packages in Gentoo
are respectively the Perl project and the Python project.  Strictly
speaking, both could be considered herd-like -- after all, they maintain
a lot of packages belonging to the same category.  To some degree, this
is true.  However, I believe those make sense because:

a. They maintain a central group of packages, eclasses, policies etc.
related to writing ebuilds using the specific programming language,
and help other developers with it.  The existence of such a project is
really useful.

b. The packages maintained by them have many common properties and
frequently come from common sources (CPAN, PyPI), and that makes it
possible for a large number of developers to actually maintain all
of them.

I know the Python project better, so I'll add some detail.  It does not
accept all Python packages (although some developers insist on adding us
to them without asking), and especially not random programs written in
the Python language.  It specifically focuses on Python module packages,
i.e. resources generally useful to Python programmers.  This is what
makes it different from a common herd project.

The third biggest project in Gentoo is -- in my opinion -- a perfect
example of a problematic herd-project.  The games project maintains
a total of 877 packages and, sad to say, many are in really bad shape.
Even if we presumed all developers were active, this gives us 175
packages per person, and I seriously doubt one person can actively
maintain that many programs.  Add to that the fact that many of them are
proprietary and fetch-restricted, and only the people possessing a copy
can maintain them, and you see how blurry the package mapping is.
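
As a rough illustration of that per-person figure, here is a minimal
Python sketch.  It assumes the usual metadata.xml layout (a
<maintainer type="project"> element with an <email> child).  The sample
package names, the project address as used here and the member count of
5 (which is merely what 877 / 175 implies) are assumptions made for
illustration, not data read from the tree.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: package name -> contents of its metadata.xml.
# Names and addresses are made up for illustration.
SAMPLE_METADATA = {
    "games-misc/example-a": """<pkgmetadata>
  <maintainer type="project"><email>games@gentoo.org</email></maintainer>
</pkgmetadata>""",
    "games-misc/example-b": """<pkgmetadata>
  <maintainer type="person"><email>dev@example.org</email></maintainer>
  <maintainer type="project"><email>games@gentoo.org</email></maintainer>
</pkgmetadata>""",
    "app-misc/example-c": """<pkgmetadata>
  <maintainer type="person"><email>dev@example.org</email></maintainer>
</pkgmetadata>""",
}


def packages_per_project(metadata_by_package):
    """Count how many packages list each project as a maintainer."""
    counts = Counter()
    for xml_text in metadata_by_package.values():
        root = ET.fromstring(xml_text)
        projects = set()
        for maintainer in root.findall("maintainer"):
            # Only project maintainers are of interest here.
            if maintainer.get("type") != "project":
                continue
            email = maintainer.findtext("email")
            if email:
                projects.add(email.strip())
        # A set avoids counting a package twice for the same project.
        counts.update(projects)
    return counts


def load_per_member(package_count, member_count):
    """Average number of packages per project member."""
    if member_count <= 0:
        raise ValueError("a project needs at least one member")
    return package_count / member_count


counts = packages_per_project(SAMPLE_METADATA)
print(counts["games@gentoo.org"])   # packages assigned in the sample
print(load_per_member(877, 5))      # assumed 5 members, see above
```

The average is optimistic: it presumes every member is active and
willing to touch every package, which is exactly what a herd-like
project cannot promise.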

Let's look at the next projects on the list.  Proxy-maint is very
specific as it proxies contributors; however, it is technically valid
since all project members can (and should) actively proxy for any
maintainers we have.  Though I have to admit the number of maintained
packages simply overburdens us.

Haskell, Java, Ruby are other examples of projects focused on
programming languages.  KDE and GNOME projects generally make sense
since packages maintained by those projects have many common features,
and the core set has common upstream and sometimes synced releases.  It
is reasonable to assume members of those projects will maintain all, or
at least the majority, of those packages.

The next project is Sound -- and in my experience, it involves a lot of
poorly maintained or unmaintained packages.  Again, the problem is that
the packages maintained by the project have little in common -- why
would any single person maintain a dozen audio players, converters,
libraries, etc.?  Having multiple people in the project may increase
the chance that they would happen to cover a larger set of competing
packages but that's really more incidental than expected.

This is basically how I'd summarize the difference between a valid
project and a herd-project.  A valid project maintains packages that
have many common properties, where it really makes sense for
an arbitrarily chosen project member to take care of an arbitrarily
chosen package maintained by the project.  A herd-project maintains
packages that have only a common topic, and usually means that
an arbitrarily chosen project member maintains only a small subset of
all packages maintained by the project.

Looking further through the list, projects that seem to make sense
include ROS, Emacs, maybe base-system, SELinux, ML, X11 (after all, it
maintains core Xorg and nobody sets them as 'backup' maintainers for
random X11 programs), PHP, vim...

Projects that are herd-like include science (possibly with all its
flavors), netmon, video, desktop-misc (this is the very example of
'random programs'), graphics...


What do I propose?
------------------
I'd like to propose either disbanding herd-like projects entirely, or
transforming them into more proper projects.  Not only those that are
clearly dysfunctional but also those that incidentally happen to work
(e.g. because they maintain only a few packages, or because they
represent a single developer with wide interests).

More specifically, I'd like each of the affected projects to choose
between:

a. disbanding the project entirely and finding individual maintainers
for all packages,

b. reducing the packages maintained by the project to a well-defined
'core set' whose maintenance by a group of developers makes sense,
and finding individual maintainers for the remaining packages,

c. splitting one or more smaller projects with well-defined scope from
the project, and doing a. or b. for the remaining packages.

Let's take a few examples.  For a start, the cron project.  Previously,
it maintained a number of different cron implementations (most having
their individual maintainers by now), a cronbase package and
cron.eclass.
In this context, option a. means disbanding the project entirely.  Some
packages already have maintainers, others go maintainer-needed.

Option b. would most likely involve leaving a cron project as a small
entity to provide policies for consistent cron handling, and maintain
the cronbase package and cron.eclass.  Different cron implementations
would go to individual maintainers anyway.

A similar example can be made for the PAM project that maintained
pambase, Linux-PAM, pam.eclass and some PAM modules.  Here a. means
giving all packages away, and b. means leaving a minimal project that
maintains policies, pambase, Linux-PAM and the eclass.  The individual
modules (except maybe for very common ones, if there were any) would
find individual maintainers.

A good example for the c. option is the recently revived VoIP project.
Again, this is an example of a herd-project that tries to maintain
an arbitrary set of loosely related packages.  To some, it might make
sense, especially since there are only a few VoIP packages left in
Gentoo.
Nevertheless, there is no reason why a single project member would
maintain multiple competing VoIP stacks.

Here, the c. option would mean creating project(s) for specific stacks
of interest.  For example, if there was specific project-level interest
in maintaining Asterisk packages, an Asterisk project would make more
sense than a generic 'VoIP' one.


Why, again?
-----------
As I said before, the main problem with herds is that they introduce
an artificial and non-transparent relation between packages and package
maintainers.

Firstly, they usually tend to include packages that none of the project
members is actually interested in maintaining.  This also includes
packages added by other developers (let's shove it in here, it matches
their job description!) or packages left over from other developers
(where the project was the backup maintainer).  This means having a lot
of packages that seem to have a maintainer but actually don't.

Secondly, they frequently lack proper structure and handling of leaving
members.  Therefore, whenever a member maintaining a specific set of
packages leaves, it is possible that the number of not-really-maintained 
packages increases.

Thirdly, they tend to degenerate and become defunct (much more often
than projects that make sense).  Then, the number of
not-really-maintained
packages ends up being really high.

My goal here is to make sure that we have clear and correct information
about package maintainers.  Most notably, if a package has no active
maintainer, we really need to have 'up for grabs' issued and the package
marked as maintainer-needed, rather than hidden behind some project
whose members may not even be aware of the fact that they're its
maintainers.
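
To make "hidden behind some project" concrete, here is a minimal Python
sketch that flags packages with no active person behind any of their
listed maintainers.  All names, addresses and the notion of an "active"
set are hypothetical; how activity would actually be determined is left
open, and project membership is only a proxy, since an active member
may still not care about a given package.

```python
def hidden_unmaintained(package_maintainers, project_members, active_devs):
    """Return packages that no active developer stands behind.

    package_maintainers: package -> list of maintainer addresses
                         (people or projects)
    project_members:     project address -> list of member addresses
    active_devs:         set of addresses considered active (assumption)
    """
    flagged = []
    for package, maintainers in sorted(package_maintainers.items()):
        covered = False
        for maintainer in maintainers:
            if maintainer in project_members:
                # A project covers the package only through its members.
                members = project_members[maintainer]
                if any(dev in active_devs for dev in members):
                    covered = True
            elif maintainer in active_devs:
                covered = True
        if not covered:
            flagged.append(package)
    return flagged


# Hypothetical example data.
package_maintainers = {
    "app-misc/foo": ["someproject@example.org"],
    "app-misc/bar": ["alice@example.org"],
    "app-misc/baz": ["someproject@example.org", "bob@example.org"],
}
project_members = {"someproject@example.org": ["carol@example.org"]}
active_devs = {"alice@example.org", "bob@example.org"}

candidates = hidden_unmaintained(
    package_maintainers, project_members, active_devs
)
print(candidates)
```

Packages returned this way would be the candidates for 'up for grabs'
and maintainer-needed; the check still misses packages assigned to
a project whose active members simply do not maintain them.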


What do you think?

-- 
Best regards,
Michał Górny

