From: Michał Górny
Reply-To: gentoo-dev@lists.gentoo.org
To: gentoo-dev-announce@lists.gentoo.org
Cc: gentoo-dev@lists.gentoo.org
Date: Fri, 16 Apr 2021 17:11:22 +0200
Subject: [gentoo-dev-announce] Incoming NATTkA upgrade

Hello, everyone.

TL;DR:

1. There were a few NATTkA misfires around 2 PM UTC today. I'm sorry
   for the noise.

2. In the next hour, a major NATTkA + pkgcore upgrade should roll out.
   No problems are expected, but please contact me if you see weird
   behavior after the upgrade (especially incorrect sanity-check
   results).

3. A workaround has been added that should hopefully finally fix
   occasional misbehavior due to Bugzilla race conditions.
   As a side effect, NATTkA may be a bit slower in responding to new
   bugs (up to 4 minutes of delay).

Full explanation follows.

Infra's been running an old version of NATTkA for quite some time.
The previous upgrade attempt (which involved an incompatible pkgcheck
API change) failed due to some cryptic bugs. A lot of
stable/keywording requests suddenly started failing -- and it seemed
that pkgcheck was checking keyworded ebuilds in the temporary against
old dependencies in /usr/portage.

I've been doing some new development in NATTkA today, and in order to
deploy it cleanly I've finally decided to try figuring out what's
wrong with new NATTkA + pkgcore. I've installed the new versions on
martin (the Infra host that used to run NATTkA in the past), and
started testing them.

I didn't notice that puppet had failed to remove the old NATTkA
cronjob from martin. So when NATTkA was installed again, the cronjob
started running the broken NATTkA version, and it started fighting
with the correct instance over bugs. As a result, a few bugs have
seen ping-pong between sanity-check+ and sanity-check- results. After
noticing the problem, I've removed the old cronjob. I apologize for
the bugspam caused by this.

The good news is that I've discovered that upgrading to the latest
~arch pkgcore & co. (unmasked versions) resolves the problem in
question. Since NATTkA is run on a different host than other services
requiring old pkgcore, I am going to deploy the full set of new
versions shortly. The initial testing run didn't yield any suspicious
results, so hopefully there will be no major problems this time.

The new version also includes a workaround for weird NATTkA behavior
-- you might have noticed in the past that NATTkA was readding arch
teams to fixed stabilization requests, or that today it reverted
'package list' to an earlier state while expanding it.
I've been trying to figure out what's wrong with NATTkA's logic for a
long time, and I've finally come to the conclusion that the problem
is actually in Bugzilla. I haven't verified the exact cause, but it's
most likely that Bugzilla is executing multiple SELECT queries while
performing the bug search, and therefore could end up with a
combination of bug properties from before and after an update.

This is the only way I can explain bug #779535 [1]. In a single
action, CC-ARCHES was added to the bug and the package list was
changed. However, NATTkA reverted to the old package list while
expanding -- which can happen only if the bug had CC-ARCHES already.
Both the keywords and the package list are grabbed from Bugzilla via
a single REST API query, so my only explanation for this is that the
Bugzilla API returned the new keywords but the old package list.

To avoid this, NATTkA now skips bugs that were updated less than
60 seconds before the search is run. These bugs will be deferred to
the next run (i.e. 4 minutes later), and Bugzilla should have synced
up by then. Of course, this is going to work only if the 'last change
time' field is updated no later than other bug data.

If you have any questions or problems, please do not hesitate to
contact me or report a bug (either on Gentoo Bugzilla, or on NATTkA's
GitHub issue tracker). That said, I realize there's quite a number of
problems reported already, and I hope I'll be able to start
addressing them ~next month.

[1] https://bugs.gentoo.org/779535#c8

-- 
Best regards,
Michał Górny
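
For readers who want to see the deferral rule from the message above
in concrete form (skip bugs changed within the last 60 seconds and
pick them up on the next run), here is a rough Python sketch. It is
an illustration under stated assumptions, not NATTkA's actual code:
the function name split_settled, the SETTLE_TIME constant, the
dict-based input and the bug numbers 1 and 2 are all made up for the
example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant: how long a bug must have been left untouched
# before it is considered safe to process (60 seconds per the message).
SETTLE_TIME = timedelta(seconds=60)


def split_settled(bugs, now, settle_time=SETTLE_TIME):
    """Split bugs into (ready, deferred) based on last change time.

    `bugs` maps a bug number to its last change time (an aware
    datetime). Bugs changed later than `now - settle_time` are
    deferred to the next run; all others are ready to process.
    """
    cutoff = now - settle_time
    ready, deferred = [], []
    for bug_no, last_change in sorted(bugs.items()):
        if last_change > cutoff:
            deferred.append(bug_no)
        else:
            ready.append(bug_no)
    return ready, deferred


# Made-up example data: bug 1 changed 10 seconds ago, bug 2 changed
# 5 minutes ago.
now = datetime(2021, 4, 16, 15, 0, 0, tzinfo=timezone.utc)
bugs = {
    1: now - timedelta(seconds=10),
    2: now - timedelta(minutes=5),
}
ready, deferred = split_settled(bugs, now)
print("process now:", ready)          # process now: [2]
print("defer to next run:", deferred)  # defer to next run: [1]
```

As the message notes, a check like this only helps if the last
change time itself is never older than the rest of the bug data
returned by the search.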