On Sun, Dec 6, 2015 at 6:36 AM, Michał Górny <mgorny@gentoo.org> wrote:
Hello,

Hi!
 

As you have seen multiple times, I'm running a minimalistic CI service
for Gentoo that checks the repository for major issues using pkgcheck.
So far its automation is limited to sending a mail to the dedicated
gentoo-automated-testing@lists.gentoo.org mailing list whenever the
breakages change. From there, I compare the results to the recent git
log and mail the developers at fault, pointing out the bad commit.

A few developers have already subscribed to the mailing list to check
whether they have caused any new breakages and to fix them quickly. For
others, it's pretty much just me caring to check, which also means that
when I'm not around, things are left broken.


So this sort of brings up a point of responsibility.

 
Automating the blaming process has been suggested multiple times
already, but so far I have considered it not worth the effort. Mostly
because many of the issues are indirect, and trying to automatically
figure them out from a combination of the pkgcheck report and recent
commits would be hard, and could cause false positives. For example,
some of the depgraph breakages happen because of package.mask changes --
figuring that out automatically wouldn't be easy, and the script could
blame an irrelevant commit in the package.

However, it was suggested recently that I could make it mail
the maintainers of the affected packages. Even though most often it's
not them who are at fault, it was suggested that they'd prefer to know
that their packages are broken.

I think there are a few issues:

1) Not everyone cares. I think you can either go for an opt-in approach (hard: you need to keep state) or offer clear opt-out / filtering instructions (a link at the bottom of the email that points at the opt-out instructions on the wiki). Either decision will piss people off; I wouldn't fret over it as long as you pick one.

2) Unclear ownership of the problem. One guy makes a commit, 100 packages break. Who is responsible? It's really murky. This is really the toughest problem to me.

3) Problems are not stateless (e.g. many are transient, as they are fixed later by developers). Is the email I got 8 hours ago still relevant? What we normally see in cases like this is a framework to manage "incidents", so what you might see is an incident app. The CI infrastructure detects a problem and opens an incident. When the incident is opened, you trigger a notification (said email). Typically an incident can be claimed (a human takes ownership and fixes it), or a future run of the automation detects that the problem is fixed and closes the incident.

The problem with 3, of course, is that you are very much re-inventing a bunch of functionality that is already in Bugzilla, which leads to the argument of 'why not open bugs for breakages' ;)
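
To make the incident lifecycle from point 3 concrete, here is a minimal sketch of it, assuming each CI run yields a set of failure keys. The class names, the key format and the notify callback are all hypothetical; none of this is part of pkgcheck or the existing CI scripts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Iterable, List, Optional


class State(Enum):
    OPEN = "open"
    CLAIMED = "claimed"
    CLOSED = "closed"


@dataclass
class Incident:
    key: str                      # hypothetical identifier, e.g. a package name
    state: State = State.OPEN
    owner: Optional[str] = None   # set when a human claims the incident


class IncidentTracker:
    """Keeps incident state between CI runs (in memory only in this sketch)."""

    def __init__(self, notify: Callable[[Incident], None]) -> None:
        self._notify = notify
        self._incidents: Dict[str, Incident] = {}

    def process_run(self, failures: Iterable[str]) -> None:
        current = set(failures)
        # Open an incident, and notify once, for each failure that has no
        # active incident yet. Known open/claimed incidents stay silent.
        for key in sorted(current):
            inc = self._incidents.get(key)
            if inc is None or inc.state is State.CLOSED:
                inc = Incident(key)
                self._incidents[key] = inc
                self._notify(inc)
        # Close incidents whose failure no longer shows up in this run.
        for key, inc in self._incidents.items():
            if key not in current and inc.state is not State.CLOSED:
                inc.state = State.CLOSED

    def claim(self, key: str, owner: str) -> None:
        inc = self._incidents[key]
        if inc.state is State.CLOSED:
            raise ValueError("cannot claim a closed incident: " + key)
        inc.state = State.CLAIMED
        inc.owner = owner

    def active(self) -> List[Incident]:
        return [i for i in self._incidents.values()
                if i.state is not State.CLOSED]


sent = []
tracker = IncidentTracker(notify=lambda inc: sent.append(inc.key))
tracker.process_run(["dev-libs/foo"])      # opens an incident, one mail
tracker.process_run(["dev-libs/foo"])      # still broken, no second mail
tracker.claim("dev-libs/foo", "some-dev")  # a human takes ownership
tracker.process_run([])                    # fixed, incident is closed
```

A real version would have to persist the incidents between runs, which is exactly the state-keeping that Bugzilla already provides.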

-A
 

So what do you think? Would it be fine to mail the package maintainers
whenever their packages break? Would it be a problem if I just CC-ed
all the maintainers on the gentoo-automated-testing mails? Please note
that the breakages are caught per-package, and the script wouldn't be
able to respect restrict="" or hand-written maintainer descriptions ;-).
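
As a rough illustration of the per-package CC idea, the following sketch collects the maintainer e-mails for a list of broken packages. It assumes the metadata.xml contents are already available as strings keyed by package name. The function names are hypothetical, and as noted above it ignores restrict="" and maintainer descriptions; herd entries are not handled either.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Mapping, Set


def maintainer_emails(metadata_xml: str) -> Set[str]:
    """Return the e-mails of the package-level maintainers."""
    root = ET.fromstring(metadata_xml)
    emails = set()
    # Only direct children of the root element are read, so upstream
    # maintainers nested under <upstream> are not picked up.
    for maintainer in root.findall("maintainer"):
        email = maintainer.findtext("email")
        if email and email.strip():
            emails.add(email.strip())
    return emails


def cc_list(broken: Iterable[str], metadata: Mapping[str, str]) -> List[str]:
    """Build a sorted, de-duplicated CC list for the broken packages."""
    cc = set()
    for package in broken:
        xml_text = metadata.get(package)
        if xml_text is not None:   # packages without metadata are skipped
            cc |= maintainer_emails(xml_text)
    return sorted(cc)


# Made-up sample data for illustration only.
sample = {
    "dev-libs/foo": (
        "<pkgmetadata>"
        "<maintainer><email>alice@example.org</email></maintainer>"
        "<upstream><maintainer><email>up@example.org</email></maintainer></upstream>"
        "</pkgmetadata>"
    ),
    "dev-libs/bar": (
        "<pkgmetadata>"
        "<maintainer><email>bob@example.org</email></maintainer>"
        "</pkgmetadata>"
    ),
}
recipients = cc_list(["dev-libs/foo", "dev-libs/bar"], sample)
```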

--
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>