* Re: [gentoo-dev] overlays.gentoo.org restoration & post-mortem
From: Alec Warner @ 2014-01-18 5:58 UTC
To: Gentoo Dev
On Fri, Jan 17, 2014 at 9:23 PM, Kent Fredric <kentfredric@gmail.com> wrote:
>
> On 18 January 2014 18:02, Robin H. Johnson <robbat2@gentoo.org> wrote:
>
>> - More people need to use the infra-status page to learn about the state
>> of Gentoo services.
>>
>
>
> A service middle layer like Fastly or Cloudflare that could link to the
> infra page might be good here, so that when an outage occurs (at least
> on the web side) appropriate links to infra can be given.
>
Cloud stuff aside (most of infra is neither very experienced with nor
trusting of cloud services), I think there was a lot of indecision during
the outage.
Do we wait for the sponsor or restore from backup?
How good are the backups? (Turns out, they were decent.)
How much work is it to rebuild from them? (Turns out, one evening of
Robin's time plus incidentals.)
Once we got the data back on the new machine, why did we post the
all-clear? After that we knew there was corruption, but it took a long time
to disable git and HTTP access. Some repos were missing, some were corrupt,
etc.
We don't have procedures for these sorts of things. I think we were
conservative in the changes we made. How do you disable a service like
gitolite? We deployed two fixes. One was to disable SSH for the 'git' user;
the other was to move the authorized keys files out of the way. We pursued
these avenues independently, and neither was checked into configuration
management, which I wish had happened. Later, when we disabled the HTTP part
(to make overlays return 503s), that change was checked in, which was nice.
Certainly I was afraid of breaking stuff for Robin, so I really tried to
avoid doing anything unless I was confident it would not impact him.
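To illustrate the "throw 503s" idea, here is a minimal sketch of a stand-in
HTTP handler that answers every request with a 503 and points people at a
status page. It is not what was actually deployed on overlays; the names
`maintenance_response` and `MaintenanceHandler`, the status URL, and the
Retry-After value are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical placeholder; substitute the real status page address.
STATUS_URL = "https://status.example.org/"


def maintenance_response(status_url=STATUS_URL, retry_after=3600):
    """Build the (status, headers, body) triple for a maintenance reply."""
    body = (
        "Service temporarily unavailable.\n"
        f"See {status_url} for current status.\n"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        # Hint to well-behaved clients about when to try again (seconds).
        "Retry-After": str(retry_after),
    }
    return 503, headers, body


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET and POST with the same 503 maintenance reply."""

    def do_GET(self):
        status, headers, body = maintenance_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

    # Treat POST requests the same way as GET requests.
    do_POST = do_GET
```

Something like `HTTPServer(("127.0.0.1", 8080), MaintenanceHandler).serve_forever()`
(with `HTTPServer` also imported from `http.server`) would serve it locally.
The point is only that a switch like this is small enough to keep in
configuration management ahead of time.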
> And the infra status page is not exactly obvious. It's not listed on the
> "gentoo sites" list on the top right, and perhaps it ought to be.
>
I consider the page a great success in this story. I'm really happy about
it, and while you can always say 'hey we could have done better here' I
think we did pretty well.
-A
>
>
>
>
> --
> Kent
>
> perl -e "print substr( \"edrgmaM SPA NOcomil.ic\\@tfrken\", \$_ * 3, 3 )
> for ( 9,8,0,7,1,6,5,4,3,2 );"
>
> http://kent-fredric.fox.geek.nz
>
2014-01-18 5:02 [gentoo-dev] overlays.gentoo.org restoration & post-mortem Robin H. Johnson
2014-01-18 5:23 ` Kent Fredric
2014-01-18 5:58 99% ` Alec Warner