From: Kent Fredric <kentnl@gentoo.org>
To: gentoo-project@lists.gentoo.org
Subject: Re: [gentoo-project] Call for agenda items - Council meeting 2016-08-14
Date: Tue, 9 Aug 2016 17:32:55 +1200	[thread overview]
Message-ID: <20160809173255.0ddfa090@katipo2.lan> (raw)
In-Reply-To: <febe98a2-e8c9-06c8-aa17-3fbac2788364@gentoo.org>


On Mon, 8 Aug 2016 19:07:04 -0700
Jack Morgan <jmorgan@gentoo.org> wrote:

> On 08/08/16 05:35, Marek Szuba wrote:
> > 
> > Bottom line: I would say we do need some way of streamlining ebuild
> > stabilisation.  
> 
> I vote we fix this problem. I'm tired of having this same discussion
> every 6 or 12 months. I'd like to see less policy discussion and more
> technical solutions to the problems we face.
> 
> I propose calling for volunteers to create a new project that works on
> solving our stabilization problem. I see that looking like the
> following:
> 
> 1) project identifies the problem(s) with real data from Bugzilla and
> the portage tree.
> 
> 2) new project defines a technical proposal to fixing this issue, then
> presents it to the developer community for feedback. This would
> include defining tools needed or used
> 
> 3) start working on solution + define future roadmap
> 
> 
> All processes and policies should be on the table for negotiating in
> the potential solution. If we need to reinvent the wheel, then let's
> do it.
> 
> To be honest, adding more policy just ends up making everyone unhappy
> one way or the other.
> 
> 

There's a potential way to get a technical solution that somewhat
alleviates the need for such rigorous arch testers, without degrading
the stabilisation mechanism to a "blind monkey system that stabilises
based on conjecture".

I've mentioned it before, ages ago, somewhere on the gentoo-dev list.

The idea is basically to instrument portage with an (optional) feature
that, when turned on, records and submits certain facts about every
failed or successful install, the objective being to spread the load of
what `tatt` does organically across the participant base.

1. Firstly, make no demands of homogeneity or even sanity for a user's
system to participate. Everything they throw at the system I'm about
to propose should be considered "valid".

2. Every time a package is installed, or an install is attempted, the
outcome of that installation is classified in one of a number of ways:

   - installed OK without tests
   - installed OK with tests
   - failed tests
   - failed install
   - failed compile 
   - failed configure

Each of these is a single state in a single field.

3. The Name, Version, and SHA1 of the ebuild that generated the report.


4. The USE flags and any other pertinent ( and carefully selected by
Gentoo ) flags are included, each as a single field in a property set,
and decomposed into structured property lists where possible.

5. <arch> satisfaction data for the target package at the time of
installation is recorded.

eg:

   KEYWORDS="arch"  + ACCEPT_KEYWORDS="~arch" -> [ "arch(~)"  ]
   KEYWORDS="~arch" + ACCEPT_KEYWORDS="~arch" -> [ "~arch(~)" ]
   KEYWORDS="arch"  + ACCEPT_KEYWORDS="arch"  -> [ "arch"     ]
   KEYWORDS=""      + ACCEPT_KEYWORDS="**"    -> [ "(**)"     ]

This seems redundant, but it is basically suggesting "hey, if you're
insane and setting lots of different arches in ACCEPT_KEYWORDS, that
would be relevant data to use to ignore your report". This data can also
be used with other data I'll mention later to isolate users with "mixed
keywording" setups.

6. For every dependency listed in *DEPEND, a dictionary/hash of

  "specified atom" -> {
     name -> resolved dependency name
     version -> version of resolved dependency
     arch -> [ satisfied arch spec as in #5 ]
     sha1 -> Some kind of SHA1 that hopefully turns up in gentoo.git
  }


is recorded in the report at the time of the result.

The "satisified arch spec" field is used to isolate anomalies in
keywording and user keyword mixing and filter out non-target reports
for stabilization data.

7. A Submitter Unique Identifier

8. Possibly a Submitter-Machine Unique Identifier.

9. The whole build log will be included compressed, verbatim.

This latter part would be an independent option from the "reporting"
feature, because it's a slightly more invasive privacy concern than the
others, in that arbitrary code execution can leak private data.

Hence, people who turn this feature on have to know what they're
signing up for.

10. All of the above data is pooled and shipped as a single report, and
submitted to a "report server" and aggregated.
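
To make the shape of all this more concrete, here's a rough sketch of
what a single report could look like, expressed as a plain Python dict
(every field name is hypothetical and purely illustrative; none of this
is an existing portage structure or API):

  report = {
      # 2. outcome of the attempt, a single state in a single field
      "result": "installed_ok_with_tests",
      # 3. name, version and SHA1 of the ebuild that generated the report
      "package": {
          "name": "dev-lang/perl",
          "version": "5.24.0",
          "ebuild_sha1": "...",
      },
      # 4. USE flags and other pertinent flags, as a property set
      "use": ["berkdb", "gdbm", "-debug"],
      # 5. <arch> satisfaction data at the time of installation
      "arch": ["~amd64(~)"],
      # 6. one entry per *DEPEND atom, resolved at the time of the result
      "depends": {
          ">=dev-libs/libfoo-1.2": {
              "name": "dev-libs/libfoo",
              "version": "1.3",
              "arch": ["amd64(~)"],
              "sha1": "...",
          },
      },
      # 7 / 8. submitter and submitter-machine identifiers
      "submitter_id": "...",
      "machine_id": "...",
      # 9. compressed build log, only present if the user opted in
      "build_log": None,
  }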


With all of the above, in the most naive of situations, we can use
that data to at least give us a lot more assurance than "well, 30
days passed, and nobody complained", because we'll have a paper trail
of a known, countable number of successful installs, which, while not
representative, are still likely to be more diverse and more reassuring
than the deafening silence of no feedback.

And in non-naive situations, the results for given versions can be
aggregated and compared, and factors that are present can be correlated
with failures statistically.

And this would give us a status board of "here's a bunch of
configurations that seem to be statistically more problematic than
others, might be worth investigating".
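
Purely to illustrate the kind of naive correlation I have in mind (and
assuming the hypothetical report fields sketched earlier; this is not
any existing tool), the first pass could be as dumb as a per-factor
failure rate:

  from collections import Counter

  def factor_failure_rates(reports):
      # reports: an iterable of dicts shaped like the sketch above
      seen = Counter()
      failed = Counter()
      for r in reports:
          bad = r["result"].startswith("failed")
          for flag in r["use"]:
              seen[flag] += 1
              if bad:
                  failed[flag] += 1
      # factors with unusually high rates form the "worth a look" list
      return {flag: failed[flag] / seen[flag] for flag in seen}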

But there would be no burden to actually dive into the logs unless you
found clusters of failures from different sources under the same
scenarios ( and this is why not everyone *has* to send build logs for
this to be effective: just enough people have to report "x configuration
bad", and some subset of them have to provide elucidating logs ).

None of what I mention here is conceptually "new", I've just
re-explained the entire CPAN Testers model in terms relevant to Gentoo,
using Gentoo parts instead of CPAN parts.

And CPAN authors find it *very effective* for assurance that they
didn't break anything: they ship a TRIAL release ( akin to our ~arch ),
and then wait a week or so while people download and test it.

And pretty much anyone can become "a tester": there's no barrier to
entry and no membership requirements. Just install the tools, get
yourself an ID, and start installing stuff with tests (the default),
and the tools will automatically fire off those reports to the hive.
The author gets a big pretty matrix of "we're good here", and after no
red results in some period they go "hey, yep, we're good" and ship a
stable release.

Or maybe occasional pockets of "you dun goofed" where there will be a
problem you might have to look into ( sometimes those problems are
entirely invalid problems, ... this is somehow typically not an issue )

http://matrix.cpantesters.org/?dist=App-perlbrew+0.76

And if you throw variant analysis into the mix, you get those other
facts compared and ranked by "likelihood to be part of the problem":

http://analysis.cpantesters.org/solved?distv=App-perlbrew-0.76

^ You can see here that variant analysis found 3 common strings in the
logs that indicated a failure, and it pointed the finger directly at
the failing test as a result. And then in rank #3, you can see it
pointing a finger at CPAN::Perl::Releases as "a possible problem highly
correlated with failures", with the -0.5 theta on version 2.88.

Lo and behold, automated differential analysis has found the bug: 

https://rt.cpan.org/Ticket/Display.html?id=116517

It still takes a human to 

a) decide to look
b) decide the differential factors are useful enough to pursue 
c) verify the problem manually by using the guidance given
d) manually file the bug

But the point here is that we can actually build some infrastructure
that will give automated tooling some degree of assurance that "this
can probably be safely stabilized now, the testers aren't seeing any
issues".

It's also just the sort of data collection that can lend itself to
much more powerful benefits.

The only hard parts are:

1. Making a good server to handle these reports that scales well
2. Making a good client for report generation, collection from PORTAGE,
and submission ( a rough sketch of the submission half is below )
3. Getting people to turn on the feature
4. Getting enough people using the feature that the majority of the
"easy" stabilizations can happen hands-free. 

And we don't even have to do the "fancy" parts of it now. Just pools of

   "package: arch = 100 pass / 0 fail, archb = 10 pass / 0 fail"

would be a great start.
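
Even that much falls out of the raw reports with almost no code; a
minimal sketch of the tallying (again using the hypothetical report
fields from the sketch above):

  from collections import defaultdict

  def tally(reports):
      # (package, arch) -> [pass count, fail count]
      counts = defaultdict(lambda: [0, 0])
      for r in reports:
          ok = r["result"].startswith("installed_ok")
          for arch in r["arch"]:
              key = (r["package"]["name"], arch)
              counts[key][0 if ok else 1] += 1
      return counts

  # e.g.:
  # for (pkg, arch), (passed, failed) in sorted(tally(reports).items()):
  #     print("{}: {} = {} pass / {} fail".format(pkg, arch, passed, failed))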

Because otherwise we're relying 100% on negative feedback, and assuming
that the absence of negative feedback is positive, when the reality
might be closer to: the problems were too confusing to report as an
explicit bug; the problems faced were deemed unimportant and the person
in question gave up before reporting them; the user encountered some
other barrier to reporting; ... or maybe nobody is actually using the
package at all, so it could be completely broken and nobody notices.

And it seems entirely haphazard to encourage tooling that *builds*
upon that assumption.

At least with the manual stabilization process, you can be assured that
at least one human will personally install, test, and verify a package
works in at least one situation.

With a completely automated stabilization that relies on the absence of
negative feedback to stabilize, you're *not even getting that*.

Why bother with stabilization at all if the entire thing is merely
*conjecture*?

Even a broken, flawed stabilization workflow done by teams of people
who are bad at testing is better than a stabilization workflow
implemented on conjecture of stability :P






