From: Alec Warner <antarus@gentoo.org>
To: Gentoo Dev <gentoo-dev@lists.gentoo.org>
Subject: Re: [gentoo-dev] overlays.gentoo.org restoration & post-mortem
Date: Fri, 17 Jan 2014 22:01:52 -0800
Message-ID: <CAAr7Pr8N1bHVVYihpjVntZ_AdED-8Fh30e=Aknh8=-rFDkBi9g@mail.gmail.com>
In-Reply-To: <20140118050256.GF3378@orbis-terrarum.net>
On Fri, Jan 17, 2014 at 9:02 PM, Robin H. Johnson <robbat2@gentoo.org> wrote:
> overlays.gentoo.org service has been restored on a new system.
> Some statistics and a post-mortem follow.
>
> Special thanks to antarus and a3li for all their interactions with our
> sponsor, and managing most of the details. I just did the final data
> recovery and this writeup.
>
> Please resume using the service, and if you see something weird that you
> think is different from before, please file a bug for Infrastructure.
>
> In the process, the service moved to a new machine. The SSH keys have
> changed as follows:
> DSA: d6:71:99:1f:46:c9:42:95:e1:9d:be:8e:f7:76:51:b5
> RSA: 92:b5:40:16:63:a3:61:9f:d7:63:64:ba:d5:51:41:b9
> ECDSA: 96:f0:29:e6:d4:85:58:46:31:ba:0e:17:0b:8c:fa:d8
>
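For anyone who wants to check the new host keys by hand: the
colon-separated values above are in the legacy MD5 fingerprint format,
which is the MD5 digest of the base64-decoded key blob from a public
key line. A minimal Python sketch follows. The function name
md5_fingerprint and the sample key line are made up for illustration;
the sample is not the real overlays.gentoo.org key.

```python
import base64
import hashlib


def md5_fingerprint(pubkey_line: str) -> str:
    """Return the legacy colon-separated MD5 fingerprint of an OpenSSH public key line."""
    # A public key line looks like "<type> <base64 blob> [comment]".
    blob_b64 = pubkey_line.split()[1]
    digest = hashlib.md5(base64.b64decode(blob_b64)).hexdigest()
    # Group the 32 hex digits into 16 colon-separated byte pairs.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Placeholder blob for illustration only; NOT a real host key.
sample_line = "ssh-rsa " + base64.b64encode(b"not a real key blob").decode() + " example"
print(md5_fingerprint(sample_line))
```

Feed it a line from your known_hosts entry for the server (with the
leading host field removed) and compare the result against the list
above. Recent OpenSSH releases should give the same value via
"ssh-keygen -l -E md5 -f <file>".
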
> At this time, we will NOT be restoring Trac due to low demand. If you
> still require a web-based SVN browser for old SVN repos, please contact
> us at infra@gentoo.org.
>
For Trac wiki users: the recommendation is to move to wiki.gentoo.org. If
you haven't migrated, and you need a copy of your Trac wiki pages from
overlays.gentoo.org, please file a bug against infra and someone (me) will
restore them on a request-by-request basis. I think the deal is that I
can pretty trivially give you a tarball of markup files (one per wiki page).
-A
>
> If you have a dev/ repo in the 'IMPORTANT' list below, you MUST push
> to the server again.
>
> IMPORTANT: The following repos were damaged beyond repair, and were not
> available in backups. You'll need to push again; I have reset the repos
> to empty:
> dev/anarchy.git
> dev/dberkholz.git
> dev/dev-zero.git
> dev/dilfridge.git
> dev/fordfrog.git
> dev/graaff.git
> dev/maekke.git
> dev/mschiff.git
> dev/quantumsummers.git
> dev/zorry.git
>
> FYI: The following repos appeared to be empty:
> dev/b33fc0d3.git
> dev/moult.git
> dev/tomwij.git
> user/blueicefield.git
> user/disinbox.git
> user/palatis.git
> user/paragon.git
> user/vmalov.git
> user/xray.git
>
> FYI: The following repos contained dangling commits/tags/blobs, and this
> should not be considered new breakage; if you have a newer copy, you are
> encouraged to push again:
> dev/blueness.git
> dev/maksbotan.git
> dev/mgorny.git
> dev/qiaomuf.git
> dev/xmw.git
> proj/betagarden.git
> proj/catalyst.git (+tags)
> proj/devmanual.git
> proj/dotnet.git
> proj/elfix.git (+tags)
> proj/emacs-tools.git
> proj/gamerlay.git
> proj/hardened-dev.git
> proj/hardened-patchset.git
> proj/kde.git
> proj/lisp.git
> proj/openrc.git (+tags)
> proj/portage.git
> proj/ruby-overlay.git
> proj/sci.git
> proj/sunrise.git
> proj/webapp-config.git
> proj/x11.git
> user/gmt.git
> user/mv.git (+blobs)
> user/palmer.git
>
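If you want to check your own clone for the same condition before
pushing again: "git fsck" prints one line of the form
"dangling <type> <object id>" for each such object. I don't know which
tooling was used to build the list above, so treat this as a rough
sketch only. The helper name count_dangling and the sample output
(with placeholder object IDs) are invented for illustration.

```python
from collections import Counter


def count_dangling(fsck_output: str) -> Counter:
    """Count dangling objects by type from `git fsck` output text."""
    counts = Counter()
    for line in fsck_output.splitlines():
        parts = line.split()
        # Lines of interest look like "dangling <type> <object id>".
        if len(parts) == 3 and parts[0] == "dangling":
            counts[parts[1]] += 1
    return counts


# Made-up sample output; the object IDs are placeholders.
sample = """\
dangling commit 1111111111111111111111111111111111111111
dangling blob 2222222222222222222222222222222222222222
dangling commit 3333333333333333333333333333333333333333
"""
print(count_dangling(sample))
```

In practice you would pass in the captured output of "git fsck" run
inside the repository instead of the sample string.
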
> Statistics:
> -----------
> 354 repos total
> - 10 repos unrecoverable (all in dev/)
> = 344 repos recovered/available
>
> 9 repos that seem to be empty
> 26 repos with dangling commits/tags/blobs
> 2 repos recovered from external sources.
>
> Breakdown by path:
> ------------------
> 193 proj/ repos
> 69 dev/ repos
> 91 user/ repos
> 1 other repo
>
> Post-mortem
> -----------
> Hornbill went offline around: 2014-01-10 13:13 UTC
> Hornbill last started a backup of VCS: 2014-01-10 07:59:04 UTC
> Hornbill last completed a backup of VCS: 2014-01-10 08:20:54 UTC
>
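To put numbers on the exposure window implied by those three
timestamps, here is a small Python sketch. It assumes the offline time
is exact to the minute (the post-mortem only says "around"), and the
variable names are mine.

```python
from datetime import datetime, timezone

# Timestamps taken from the post-mortem above (UTC); the offline time
# is only known to the minute.
backup_started = datetime(2014, 1, 10, 7, 59, 4, tzinfo=timezone.utc)
backup_completed = datetime(2014, 1, 10, 8, 20, 54, tzinfo=timezone.utc)
went_offline = datetime(2014, 1, 10, 13, 13, 0, tzinfo=timezone.utc)

# Pessimistic window: anything written after the backup started may
# have been missed by that backup run.
window_from_start = went_offline - backup_started
# Optimistic window: everything up to backup completion was captured.
window_from_end = went_offline - backup_completed

print(window_from_start)  # 5:13:56
print(window_from_end)    # 4:52:06
```

Either way the window is roughly five hours.
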
> Between the backup starting, and the server going offline, we were able
> to confirm writes to the following Git repos:
> dev/fordfrog.git
> proj/kde.git
> gitolite-admin.git
>
> We believe that there were no writes to user/ repos, but are not 100%
> certain, as the logging was insufficient for this purpose.
>
> Hornbill went offline just over a week ago: mid-afternoon on a Friday
> for the timezone where it's located. Due to staff turnover and business
> changes at the previous sponsor, we were not able to contact anybody
> until regular office hours on Monday, January 13th.
>
> The server in question, while previously functioning, was not
> recoverable after a remote hands reboot on Monday afternoon (UTC).
> On Tuesday, the sponsor was able to examine it in more depth, and
> it was not recoverable. More concerningly, it turned out to be one of
> the few remaining Gentoo infrastructure systems with IDE drives. The
> data was recovered; however, it seemed to have a lot of corruption.
>
> It was noted that our backups were missing all of the dev/ repos, due to
> a system-wide rule to exclude /dev/ from backups (the rule should only
> match the real /dev, not any directory simply named "dev"). For this
> reason, we decided to try and get the data from the old server.
>
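The difference between the two behaviours is easy to see in a toy
example. This is only a sketch: the actual backup tool and its pattern
syntax are not named above, and the paths below are hypothetical since
the server's on-disk layout is not stated either.

```python
from pathlib import PurePosixPath


def excluded_unanchored(path: str) -> bool:
    """Exclude any path with a component named 'dev' (the overly broad behaviour)."""
    return "dev" in PurePosixPath(path).parts


def excluded_anchored(path: str) -> bool:
    """Exclude only the real /dev at the filesystem root."""
    parts = PurePosixPath(path).parts
    return len(parts) >= 2 and parts[0] == "/" and parts[1] == "dev"


# Hypothetical paths; the real repository location is an assumption.
for p in ["/dev/sda1", "/srv/git/dev/fordfrog.git", "/srv/git/proj/kde.git"]:
    print(p, excluded_unanchored(p), excluded_anchored(p))
```

The unanchored check drops the dev/ repo along with the device node,
while the anchored check drops only the device node.
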
> Verification/recovery of the remaining data was also slowed by the
> need to confirm that some of the Git repos in the backup were not
> entirely clean: they contained either legacy errors from their CVS/SVN
> conversions (which turned out to be false positives) or dangling
> commits/blobs/tags.
>
> What could we do better next time:
> ----------------------------------
> - Have backups of all repos!
> - Compare the age of the backup immediately, and consider going live
> with the backup. Only 5 hours of work would have been lost, and even
> then possibly only temporarily, due to the distributed nature of Git.
> - More people need to use the infra-status page to learn about the state
> of Gentoo services.
>
> Actions for Infra
> -----------------
> - Include the dev/ repos, which were not in the backup
> - Set up Gitolite mirroring
> - Review gitolite logging (needs to be easier to confirm when writes
> took place)
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robbat2@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
>