MIME-Version: 1.0
Sender: antarus@scriptkitty.com
In-Reply-To: <CAATnKFBskCUbR5iB895Vr26Ysu0fizJFF7r1R8XrWeenXjuQSQ@mail.gmail.com>
References: <20140118050256.GF3378@orbis-terrarum.net>
 <CAATnKFBskCUbR5iB895Vr26Ysu0fizJFF7r1R8XrWeenXjuQSQ@mail.gmail.com>
Date: Fri, 17 Jan 2014 21:58:10 -0800
Message-ID: <CAAr7Pr9GGy3KsDQJ8bzcJ95NpyUM-q-DMHd2a4FVZ=+UcoDXgw@mail.gmail.com>
Subject: Re: [gentoo-dev] overlays.gentoo.org restoration & post-mortem
From: Alec Warner <antarus@gentoo.org>
To: Gentoo Dev <gentoo-dev@lists.gentoo.org>
Content-Type: multipart/alternative; boundary=bcaec53d5281e741ec04f0385729

--bcaec53d5281e741ec04f0385729
Content-Type: text/plain; charset=UTF-8

On Fri, Jan 17, 2014 at 9:23 PM, Kent Fredric <kentfredric@gmail.com> wrote:

> On 18 January 2014 18:02, Robin H. Johnson <robbat2@gentoo.org> wrote:
>
>> - More people need to use the infra-status page to learn about the state
>>   of Gentoo services.
>
> A service middle layer like fastly or cloudflare which could link to the
> infra page would be good here perhaps, so when an outage occurred (at
> least on the web side) appropriate links to infra could be given.

Cloud stuff aside (most of infra is not super experienced with or trusting of
cloud stuff), I think there was a lot of indecision during the outage.
Do we wait for the sponsor or restore from backup?
How good are the backups? (Turns out, they were decent.)
How much work is it to rebuild from them? (Turns out, one evening of Robin's
time + incidentals.)

Once we got the data back on the new machine, why did we post the all clear?
Then we knew there was corruption, but it took a long time to disable git and
http access. Some repos were missing, some were corrupt, etc.

We don't have procedures for these sorts of things. I think we were
conservative in the changes we made. How do you disable a service like
gitolite? We deployed two fixes: one was to disable ssh for the 'git' user,
the other was to move the authorized keys files out of the way. We pursued
these avenues independently, and we did not check them into configuration
management, which I wish had happened. Later, when we disabled the http part
(to make overlays throw 503s), that change was checked in, which was nice.
Certainly I was afraid of breaking stuff for Robin, so I really tried to avoid
doing anything unless I was confident it would not impact him.

> And the infra status page is not exactly obvious. It's not listed on the
> "gentoo sites" list on the top right, and perhaps it ought to be.

I consider the page a great success in this story. I'm really happy about it,
and while you can always say 'hey we could have done better here' I think we
did pretty well.

-A

> --
> Kent
>
> perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3, 3 )
> for ( 9,8,0,7,1,6,5,4,3,2 );"
>
> http://kent-fredric.fox.geek.nz

--bcaec53d5281e741ec04f0385729--