From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from pigeon.gentoo.org ([208.92.234.80] helo=lists.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-user+bounces-115080-garchives=archives.gentoo.org@lists.gentoo.org>)
	id 1OyE55-0004l1-05
	for garchives@archives.gentoo.org; Wed, 22 Sep 2010 01:25:07 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 3D7B7E0A6A;
	Wed, 22 Sep 2010 01:24:40 +0000 (UTC)
Received: from mail-iw0-f181.google.com (mail-iw0-f181.google.com [209.85.214.181])
	by pigeon.gentoo.org (Postfix) with ESMTP id 15E0DE0A6A
	for <gentoo-user@lists.gentoo.org>; Wed, 22 Sep 2010 01:24:39 +0000 (UTC)
Received: by iwn39 with SMTP id 39so52919iwn.40
        for <gentoo-user@lists.gentoo.org>; Tue, 21 Sep 2010 18:24:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:mime-version:received:received:in-reply-to
         :references:date:message-id:subject:from:to:content-type
         :content-transfer-encoding;
        bh=gcUPJwKZu7U1rq2bDiwHQKllL4+vX6Iuw+xLP4Y/pH8=;
        b=Wiz9mtjczFgbtcO+vecMjxyX3t83i5sMvED4ROwof3Wl1f1FogBf80C5e+iCVN93Xa
         wxNhSTLoiPm4UnW0afVfH3lx4HLytvlWJQyOv1f+Wp4KvpPPcepRT+ziPnslyuZWTbp8
         QKbM2PdFXgB0f+5E6nrP8v5u5qg1p1OqNwb7A=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :content-type:content-transfer-encoding;
        b=itN+J9MFC0VHN8WUYsQxfkU5iXSrt+QD7LmjZ2H/UeXf/iLgTfH97vFmHOyqrvzTi5
         0rRT6q+SIKUS0jI3CWl/55spkhEf7W6B/LDjFAUTtUuM2W6c46AwOUHjQ7GbKfpTTawc
         R9un7T6kJg88pcTxnffbS2rPO2WT6nPUiyDs4=
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
Received: by 10.42.2.81 with SMTP id 17mr573715icj.40.1285118679262; Tue, 21
 Sep 2010 18:24:39 -0700 (PDT)
Received: by 10.42.6.130 with HTTP; Tue, 21 Sep 2010 18:24:39 -0700 (PDT)
In-Reply-To: <201009212233.05120.michaelkintzios@gmail.com>
References: <AANLkTimfpE6_hXSFObJ5Ycw6+Eesj3+MTkTmi=jvDkGU@mail.gmail.com>
	<AANLkTim-98DANhL9dCH3miLLAkb4jfj_bBhV3iPmL3QC@mail.gmail.com>
	<874A0175-FD4B-4809-898C-302C18DBFA71@stellar.eclipse.co.uk>
	<201009212233.05120.michaelkintzios@gmail.com>
Date: Tue, 21 Sep 2010 18:24:39 -0700
Message-ID: <AANLkTinnoW3JJhA+FeysR-p6uytcx-TMj=P3uT5-9kzy@mail.gmail.com>
Subject: Re: [gentoo-user] machine check exception errors
From: Grant <emailgrant@gmail.com>
To: gentoo-user@lists.gentoo.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Archives-Salt: 4daa120c-c943-4026-b5f4-4275e4f3f8d2
X-Archives-Hash: bf85692fe436056c6b3dde9b443f66ec

>> >>>> I'm getting a lot of machine check exception errors in dmesg on my
>> >>>> hosted server.  Running mcelog I get:
>> >>>> ...
>> >
>> > They offered to take my machine down and do a memory test which they
>> > said would take a number of hours.  Is a memory test likely to help?
>> > Did you suggest reseating or replacing RAM modules as opposed to a
>> > memory test because it will result in less downtime?
>>
>> I suspect that your hosting provider are offering you this memory test
>> because they don't want to go swapping out memory modules willy-nilly.
>>
>> How do they know that the problem is really memory, and not your operating
>> system? If they take all this RAM out and put new RAM in, what do they do
>> with the old RAM? They don't know if it's good or bad, so are they
>> expected to just slap it in a server belonging to another customer, and
>> stitch him up?
>>
>> A memory test is likely to identify bad RAM, if it is bad, so you should
>> proceed with this. This is likely the best route to solving the problem.
>>
>> I think that ideally, for you, they would move the system image onto a
>> different known-good server with the same configuration. Then you cannot
>> complain if the same problems start occurring again. If the problem is
>> genuinely hardware then they won't. And the hosting provider is free to
>> run diagnostics on your old machine.
>>
>> But realistically, the memory test is likely to show up a bad RAM module,
>> you'll get it replaced and be up and running within a few hours. Why would
>> you refuse? If your system needed a guaranteed uptime you'd perhaps have
>> to pay for a higher level of service than the fees you're paying at
>> present.
>
> I run memory tests overnight.  If a module is seriously borked then it will
> fail earlier.  Reseating/replacing takes a few minutes, instead of hours.
>
> If they have spare machines (for dev't or testing) they can fit the memory
> module(s) there and test them exhaustively, before they put the good ones back
> into a customer's machine.

Thanks Mick and Stroller.  I'll see if they'll go for this.

- Grant