From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id 209C913832E
	for ; Tue, 9 Aug 2016 05:33:39 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 8C836E0B57;
	Tue, 9 Aug 2016 05:33:35 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id ACA3FE0B4F
	for ; Tue, 9 Aug 2016 05:33:34 +0000 (UTC)
Received: from katipo2.lan (unknown [IPv6:2406:e001:1:d01:c2f8:daff:fe83:ed01])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	(Authenticated sender: kentnl)
	by smtp.gentoo.org (Postfix) with ESMTPSA id 97B793408EC
	for ; Tue, 9 Aug 2016 05:33:32 +0000 (UTC)
Date: Tue, 9 Aug 2016 17:32:55 +1200
From: Kent Fredric
To: gentoo-project@lists.gentoo.org
Subject: Re: [gentoo-project] Call for agenda items - Council meeting 2016-08-14
Message-ID: <20160809173255.0ddfa090@katipo2.lan>
In-Reply-To:
References: <2e11e445-c25b-b7f2-def1-99aed92308b6@gentoo.org>
	<20160804162443.GA7048@whubbs1.gaikai.biz>
	<20160804231224.7b7462168f1d23e88fe4135c@gentoo.org>
	<20160804222234.GA8357@whubbs1.gaikai.biz>
	<20160805022658.GA15727@linux1>
	<20160805142859.GA19008@linux1>
	<20160805153658.GA11058@whubbs1.gaikai.biz>
	<52993bd4-afc9-197e-acda-96db413e6608@gentoo.org>
Organization: Gentoo
X-Mailer: Claws Mail 3.13.2 (GTK+ 2.24.30; x86_64-pc-linux-gnu)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Project discussion list
X-BeenThere: gentoo-project@lists.gentoo.org
Reply-To: gentoo-project@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/signed;
	micalg=pgp-sha256; boundary="Sig_/sjzgruXUYQ+WsgeOwKN4zc8"; protocol="application/pgp-signature"
X-Archives-Salt: 35accb0e-a812-49f7-9b0e-e386ad122bea
X-Archives-Hash: 9fcc546c54f0d2ead1afea44b664f6c6

--Sig_/sjzgruXUYQ+WsgeOwKN4zc8
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Mon, 8 Aug 2016 19:07:04 -0700
Jack Morgan wrote:

> On 08/08/16 05:35, Marek Szuba wrote:
> >
> > Bottom line: I would say we do need some way of streamlining ebuild
> > stabilisation.
>
> I vote we fix this problem. I'm tired of having this same discussion
> ever 6 or 12 months. I'd like to see less policy discussion and more
> technical solutions to the problems we face.
>
> I propose calling for volunteers to create a new project that works on
> solving our stabilization problem. I see that looking like the
> following:
>
> 1) project identifies the problem(s) with real data from Bugzilla and
> the portage tree.
>
> 2) new project defines a technical proposal to fixing this issue, then
> presents it to the developer community for feedback. This would
> include defining tools needed or used
>
> 3) start working on solution + define future roadmap
>
>
> All processes and policies should be on the table for negotiating in
> the potential solution. If we need to reinvent the wheel, then let's
> do it.
>
> To be honest, adding more policy just ends up making everyone unhappy
> one way or the other.
>
>

There's a potential way to garner a technical solution that somewhat
alleviates the need for such rigorous arch testers, without degrading
the stabilisation mechanic to a "blind monkey system that stabilises
based on conjecture".

I've mentioned it before, ages ago, somewhere on the Gentoo Dev list.

The idea is basically to instrument portage with an (optional) feature
that, when turned on, records and submits certain facts about every
failed or successful install, with the objective being to essentially
spread the load of what `tatt` does organically across the participant
base.

1. Firstly, make no demands of homogeneity or even sanity for a user's
   system to participate. Everything they throw at the system I'm
   about to propose should be considered "valid".

2. Every time a package is installed, or attempted to be installed, the
   outcome of that installation is classified in one of a number of
   ways:

   - installed OK without tests
   - installed OK with tests
   - failed tests
   - failed install
   - failed compile
   - failed configure

   Each of these is a single state in a single field.

3. The Name, Version, and SHA1 of the ebuild that generated the report
   are included.

4. The USE flags and any other pertinent (and carefully selected by
   Gentoo) flags are included, each as single fields in a property set,
   and decomposed into structured property lists where possible.

5. Satisfaction data for the target package at the time of installation
   is recorded, eg:

   KEYWORDS="arch"  + ACCEPT_KEYWORDS="~arch" -> [ "arch(~)" ]
   KEYWORDS="~arch" + ACCEPT_KEYWORDS="~arch" -> [ "~arch(~)" ]
   KEYWORDS="arch"  + ACCEPT_KEYWORDS="arch"  -> [ "arch" ]
   KEYWORDS=""      + ACCEPT_KEYWORDS="**"    -> [ "(**)" ]

   This seems redundant, but it is basically saying "hey, if you're
   insane and setting lots of different arches for accept keywords,
   that would be relevant data to use to ignore your report". This
   data can also be used with other data I'll mention later to isolate
   users with "mixed keywording" setups.

6.
   For every dependency listed in *DEPEND, a dictionary/hash of

     "specified atom" -> {
        name    -> resolved dependency name
        version -> version of resolved dependency
        arch    -> [ satisfied arch spec as in #5 ]
        sha1    -> Some kind of SHA1 that hopefully turns up in gentoo.git
     }

   is recorded in the report at the time of the result.

   The "satisfied arch spec" field is used to isolate anomalies in
   keywording and user keyword mixing, and to filter out non-target
   reports for stabilization data.

7. A Submitter Unique Identifier.

8. Possibly a Submitter-Machine Unique Identifier.

9. The whole build log will be included compressed, verbatim. This
   part will be an independent option to the "reporting" feature,
   because it's a slightly more invasive privacy concern than the
   others, in that arbitrary code execution can leak private data.
   Hence, people who turn this feature on have to know what they're
   signing up for.

10. All of the above data is pooled and shipped as a single report,
    submitted to a "report server", and aggregated.

With all of the above, in the most naive of situations, we can use that
data at the very least to give us a lot more assurance than "well, 30
days passed, and nobody complained", because we'll have a paper trail
of a known, countable number of successful installs, which, while not
representative, are likely to still be more diverse and more
reassuring than the deafening silence of no feedback.

And in non-naive situations, the results for given versions can be
aggregated and compared, and factors that are present can be
correlated with failures statistically.
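
To make points 2 to 8 a bit more concrete, here is a minimal sketch of
what one report could look like on the client side. None of this
exists in portage: the names `Outcome`, `Report` and `satisfaction`,
the package "app-misc/example" and the JSON encoding are all
assumptions made purely for illustration, and the handling of "**"
simply mirrors the one example given in point 5.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Outcome(Enum):
    """The single-state result field from point 2."""
    OK_WITHOUT_TESTS = "installed OK without tests"
    OK_WITH_TESTS = "installed OK with tests"
    FAILED_TESTS = "failed tests"
    FAILED_INSTALL = "failed install"
    FAILED_COMPILE = "failed compile"
    FAILED_CONFIGURE = "failed configure"


def satisfaction(keywords, accept_keywords):
    """Label which KEYWORDS entry satisfied which ACCEPT_KEYWORDS entry.

    Hypothetical helper reproducing the four examples in point 5.
    """
    labels = []
    for accepted in accept_keywords:
        if accepted == "**":
            # Assumption: "**" accepts anything, so only record that fact.
            labels.append("(**)")
            continue
        testing = accepted.startswith("~")
        arch = accepted.lstrip("~")
        # "~arch" accepts both stable and testing keywords; "arch" only stable.
        candidates = (arch, "~" + arch) if testing else (arch,)
        for keyword in candidates:
            if keyword in keywords:
                labels.append(keyword + ("(~)" if testing else ""))
                break
    return labels


@dataclass
class Report:
    """One install report (points 2 to 8); field names are illustrative."""
    name: str
    version: str
    ebuild_sha1: str
    outcome: Outcome
    use_flags: list
    satisfied: list
    dependencies: dict = field(default_factory=dict)
    submitter_id: str = ""
    machine_id: str = ""

    def to_json(self):
        data = asdict(self)
        data["outcome"] = self.outcome.value  # Enum members are not JSON-serialisable
        return json.dumps(data, sort_keys=True)


# Hypothetical usage with made-up package data.
ebuild_text = "# hypothetical ebuild contents\n"
report = Report(
    name="app-misc/example",
    version="1.0",
    ebuild_sha1=hashlib.sha1(ebuild_text.encode()).hexdigest(),
    outcome=Outcome.OK_WITH_TESTS,
    use_flags=["test"],
    satisfied=satisfaction(["arch"], ["~arch"]),
    dependencies={
        "dev-libs/dep": {
            "name": "dev-libs/dep",
            "version": "2.3",
            "arch": satisfaction(["arch"], ["~arch"]),
            "sha1": "unknown",
        }
    },
    submitter_id="hypothetical-submitter",
)
print(report.to_json())
```

The build log from point 9 is deliberately left out of the sketch,
since it is meant to be a separate opt-in.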

And this would give us a status board of "here's a bunch of
configurations that seem to be statistically more problematic than
others, might be worth investigating".

But there would be no burden to actually dive into the logs unless you
found clusters of failures from different sources failing under the
same scenarios. (And this is why not everyone *has* to send build logs
for this to be effective: enough people just have to report "x
configuration bad", and some subset of them have to provide
elucidating logs.)

None of what I mention here is conceptually "new"; I've just
re-explained the entire CPAN Testers model in terms relevant to
Gentoo, using Gentoo parts instead of CPAN parts.

And CPAN testers find it *very effective* at being assured they didn't
break anything: they ship a TRIAL release (akin to our ~arch), and
then wait a week or so while people download and test it.

And pretty much anyone can become "a tester": there's no barrier to
entry, and no requirements for membership. Just install the tools, get
yourself an ID, and start installing stuff with tests (the default),
and the tools you have will automatically fire off those reports to
the hive. You get a big pretty matrix of "We're good here", and then
after no red results in some period, they go "hey, yep, we're good"
and ship a stable release.

Or maybe there are occasional pockets of "you dun goofed", where there
will be a problem you might have to look into ( sometimes those
problems are entirely invalid problems, ... this is somehow typically
not an issue )

http://matrix.cpantesters.org/?dist=App-perlbrew+0.76

And you throw variant analysis into the mix, and you get those other
facts compared and ranked by "Likelihood to be part of the problem":

http://analysis.cpantesters.org/solved?distv=App-perlbrew-0.76

^ you see here that variant analysis found 3 common strings in the
logs that indicated a failure, and it pointed the finger directly at
the failing test as a result.

And then in rank #3, you see it's pointing a finger at
CPAN::Perl::Releases as "a possible problem highly correlated with
failures", with the -0.5 theta on version 2.88.

Lo and behold, automated differential analysis has found the bug:

https://rt.cpan.org/Ticket/Display.html?id=116517

It still takes a human to

a) decide to look
b) decide the differential factors are useful enough to pursue
c) verify the problem manually by using the guidance given
d) manually file the bug

But the point here is we can actually build some infrastructure that
will give automated tooling some degree of assurance that "this can
probably be safely stabilized now, the testers aren't seeing any
issues".

It's just also the sort of data collection that can lend itself to
much more powerful benefits as well.

The only hard parts are:

1. Making a good server to handle these reports that scales well
2. Making a good client for report generation, collection from PORTAGE
   and submission
3. Getting people to turn on the feature
4. Getting enough people using the feature that the majority of the
   "easy" stabilizations can happen hands-free.

And we don't even have to do the "Fancy" parts of it now. Just pools of

  "package: arch = 100pass/0fail archb = 10pass/0fail"

would be a great start.

Because otherwise we're relying 100% on negative feedback, and
assuming that the absence of negative feedback is positive, when the
reality might be closer to one of these: the problems were too
confusing to report as an explicit bug; the problems faced were deemed
unimportant by the person in question and they gave up before they
reported it; the user encountered some other entry barrier in
reporting; ..... or maybe nobody is actually using the package at all,
so it could be completely broken and nobody notices.

And it seems entirely haphazard to encourage tooling that *builds*
upon that assumption.
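
For what it's worth, the non-fancy per-arch pools mentioned above need
very little code on the server side. The following is only a sketch
under assumptions: the `(package, arch, outcome)` tuple layout, the
names `pool` and `looks_stable`, the choice to count "installed OK
without tests" as a pass, and the `min_pass` threshold are all made up
for illustration and are not part of any existing tool.

```python
from collections import defaultdict

# Assumption: both "installed OK" outcomes count as a pass.
PASSING = {"installed OK without tests", "installed OK with tests"}


def pool(reports):
    """Count passes and failures per (package, arch).

    `reports` is an iterable of (package, arch, outcome) tuples, where
    outcome is one of the result strings listed earlier.
    """
    pools = defaultdict(lambda: {"pass": 0, "fail": 0})
    for package, arch, outcome in reports:
        key = "pass" if outcome in PASSING else "fail"
        pools[(package, arch)][key] += 1
    return dict(pools)


def looks_stable(counts, min_pass=10):
    """Hypothetical rule: no failures and at least `min_pass` passes."""
    return counts["fail"] == 0 and counts["pass"] >= min_pass


# Made-up reports matching the "100pass/0fail" style summary.
reports = (
    [("app-misc/example-1.0", "arch", "installed OK with tests")] * 100
    + [("app-misc/example-1.0", "archb", "installed OK without tests")] * 10
    + [("app-misc/other-2.0", "arch", "failed compile")]
)
pools = pool(reports)
for (package, arch), counts in sorted(pools.items()):
    print(f"{package}: {arch} = {counts['pass']}pass/{counts['fail']}fail")
```

The point of the threshold is that a pool with zero reports is not
treated as a pass, which is exactly the "absence of negative feedback"
trap described above.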

At least with the manual stabilization process, you can be assured that
at least one human will personally install, test, and verify a package
works in at least one situation.

With a completely automated stabilization that relies on the absence of
negative feedback to stabilize, you're *not even getting that*.

Why bother with stabilization at all if the entire thing is merely
*conjecture* ?

Even a broken, flawed stabilization workflow done by teams of people
who are bad at testing is better than a stabilization workflow
implemented on conjecture of stability :P

--Sig_/sjzgruXUYQ+WsgeOwKN4zc8
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJXqWsfAAoJEOhUMksTZqggVcYQALbfZOjlgjZML1RlNDX6l9me
eoAxvvD3ymXFfDlZ+P2Bf5GZMOAyXLAgcpB1W9Tay2jD+9gCauJ1mb2kf8jqRbg5
yw3AmgnMyVVjg9BOylrDE7Q1j78fUIH4gL2fM/lHIypL2gufABj10nXMyw57bEZc
xtt673JY5Cp5ktRFFdSTpRxvq/7BHSgQucNwrkPUs8MTGZCQsrYNTyXSpo0yzRCX
apijVAnpR/DNSA8CrPz0ZPSxEnXhHvbY10XqNJQsapjfJUS2m0L89OioZiJOc/YY
bol3SchsPBPK8yusz+fP3j4j9XEjJN7Mv56Pycw2s14zBv0u+pBsg/neso3QVBxn
pVUFveCwdPYued139w1Xr1bfpxm3RbFLYsanGUkJhrAniuLZQ9vTSsLpS1KiN09+
KooVYqdeimt8xADi7oPmlp87Sr9DF1XcGrOUbQBIAvBU0j+pdFXtFcW0k6cXjI9p
BiQVB6weWOQVXApa5Hpx9zeMV2rRq6mBy2igDabj0k2pDI0MLHMO2CPf7sLEi7Zd
3Vs6FEbmHFKc1DHxRL3ZTkespl23EcjIXLgbaettM9uL4TQRb5DqPBfDkmLIc5J7
njvZzyRNH27wU6JVenx1otcXHKGPoHoysJ7Aj22dLz3dF+L35g/NbuxXu7G/feo+
qlU9yLgJDgIGAv1D/pX9
=fwKl
-----END PGP SIGNATURE-----

--Sig_/sjzgruXUYQ+WsgeOwKN4zc8--