From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from pigeon.gentoo.org ([208.92.234.80] helo=lists.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-user+bounces-115087-garchives=archives.gentoo.org@lists.gentoo.org>)
	id 1OyLV5-000106-6V
	for garchives@archives.gentoo.org; Wed, 22 Sep 2010 09:20:27 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 5A954E0AAC;
	Wed, 22 Sep 2010 09:20:02 +0000 (UTC)
Received: from mail-wy0-f181.google.com (mail-wy0-f181.google.com [74.125.82.181])
	by pigeon.gentoo.org (Postfix) with ESMTP id 1D003E0AAC
	for <gentoo-user@lists.gentoo.org>; Wed, 22 Sep 2010 09:20:01 +0000 (UTC)
Received: by wyf28 with SMTP id 28so372159wyf.40
        for <gentoo-user@lists.gentoo.org>; Wed, 22 Sep 2010 02:20:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:received:received:from:reply-to:to:subject:date
         :user-agent:references:in-reply-to:mime-version:content-type
         :content-transfer-encoding:message-id;
        bh=ts8V8Yhcnt5M6rF88RvvODUSE0X41IxtCW9SB7g3rv4=;
        b=KyUtfqD9FG5uxBhl1rLq52YjX2ciYZOiPQbYuiEOLYVpkgdZsEejS37v9zoglZmNpO
         HQo2k7mslhHEU9iFmg7tvBT2TL5PEM4QDPok0LOlq2fAuHsCWb0v8LtWFueTxI3AWr+Z
         nRP1+zH7p7F34JssYIMMFi1TR6kHIjEeLRVVs=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=from:reply-to:to:subject:date:user-agent:references:in-reply-to
         :mime-version:content-type:content-transfer-encoding:message-id;
        b=WpYe1144XttEqoE3xpqATc/h32kCkxdHuLYIgLnR+PE2BbHgBq+PyAdcvUxerxpeDr
         paAQFl+95MqhprtoO/gIW/utJFjeKR01tdRFg6kkLe9vygEOF98WNAOip/LcR1/KR266
         Ggly6zEExYAo7yVABsFOjflsuN7fahioutCM4=
Received: by 10.216.54.16 with SMTP id h16mr10387970wec.6.1285147201103;
        Wed, 22 Sep 2010 02:20:01 -0700 (PDT)
Received: from  (230.3.169.217.in-addr.arpa [217.169.3.230])
        by mx.google.com with ESMTPS id n17sm6678513weq.30.2010.09.22.02.19.59
        (version=TLSv1/SSLv3 cipher=RC4-MD5);
        Wed, 22 Sep 2010 02:20:00 -0700 (PDT)
From: Mick <michaelkintzios@gmail.com>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] machine check exception errors
Date: Wed, 22 Sep 2010 10:19:45 +0100
User-Agent: KMail/1.13.5 (Linux/2.6.34-gentoo-r6; KDE/4.4.5; x86_64; ; )
References: <AANLkTimfpE6_hXSFObJ5Ycw6+Eesj3+MTkTmi=jvDkGU@mail.gmail.com> <201009212233.05120.michaelkintzios@gmail.com> <AANLkTinnoW3JJhA+FeysR-p6uytcx-TMj=P3uT5-9kzy@mail.gmail.com>
In-Reply-To: <AANLkTinnoW3JJhA+FeysR-p6uytcx-TMj=P3uT5-9kzy@mail.gmail.com>
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/signed;
  boundary="nextPart2243117.1ThYMLDMh9";
  protocol="application/pgp-signature";
  micalg=pgp-sha1
Content-Transfer-Encoding: 7bit
Message-Id: <201009221019.56794.michaelkintzios@gmail.com>
X-Archives-Salt: 6a6fc419-e19f-4a7d-8be5-3c57c947c845
X-Archives-Hash: c2236e7cffa5d4272d8275a3a5345e80

--nextPart2243117.1ThYMLDMh9
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

On Wednesday 22 September 2010 02:24:39 Grant wrote:
> >> >>>> I'm getting a lot of machine check exception errors in dmesg on my
> >> >>>> hosted server.  Running mcelog I get:
> >> >>>> ...
> >> >
> >> > They offered to take my machine down and do a memory test which they
> >> > said would take a number of hours.  Is a memory test likely to help?
> >> > Did you suggest reseating or replacing RAM modules as opposed to a
> >> > memory test because it will result in less downtime?
> >>
> >> I suspect that your hosting provider are offering you this memory test
> >> because they don't want to go swapping out memory modules willy-nilly.
> >>
> >> How do they know that the problem is really memory, and not your
> >> operating system? If they take all this RAM out and put new RAM in,
> >> what do they do with the old RAM? They don't know if it's good or bad,
> >> so are they expected to just slap it in a server belonging to another
> >> customer, and stitch him up?
> >>
> >> A memory test is likely to identify bad RAM, if it is bad, so you should
> >> proceed with this. This is likely the best route to solving the problem.
> >>
> >> I think that ideally, for you, they would move the system image onto a
> >> different known-good server with the same configuration. Then you cannot
> >> complain if the same problems start occurring again. If the problem is
> >> genuinely hardware then they won't. And the hosting provider is free to
> >> run diagnostics on your old machine.
> >>
> >> But realistically, the memory test is likely to show up a bad RAM
> >> module, you'll get it replaced and be up and running within a few
> >> hours. Why would you refuse? If your system needed a guaranteed uptime
> >> you'd perhaps have to pay for a higher level of service than the fees
> >> you're paying at present.
> >
> > I run memory tests overnight.  If a module is seriously borked then it
> > will fail earlier.  Reseating/replacing takes a few minutes, instead of
> > hours.
> >
> > If they have spare machines (for dev't or testing) they can fit the
> > memory module(s) there and test them exhaustively, before they put the
> > good ones back into a customer's machine.
>
> Thanks Mick and Stroller.  I'll see if they'll go for this.

You're welcome.  Bear in mind, though, that a lot of hosters are just
glorified resellers with an account in a bigger data centre.  In many cases
they do not even have physical access to the machines.  Only the data centre
techies do, and they may be less willing to oblige and break procedure or
routine just because one end user out of hundreds/thousands complained about
some memory errors.

YMMV
-- 
Regards,
Mick

--nextPart2243117.1ThYMLDMh9
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEABECAAYFAkyZyjwACgkQVTDTR3kpaLa9WACgg7E6WvcaaFlv/JM99opBJOSr
+uEAnjjhYaDyx/pODYnBxLVdNyoIRNmH
=lv7+
-----END PGP SIGNATURE-----

--nextPart2243117.1ThYMLDMh9--