From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
	by nuthatch.gentoo.org with esmtp (Exim 4.54)
	id 1Ejs8l-0005p8-Qo
	for garchives@archives.gentoo.org; Wed, 07 Dec 2005 05:46:56 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
	by robin.gentoo.org (8.13.5/8.13.5) with SMTP id jB75ihhZ004916;
	Wed, 7 Dec 2005 05:44:43 GMT
Received: from imf20aec.mail.bellsouth.net (imf20aec.mail.bellsouth.net [205.152.59.68])
	by robin.gentoo.org (8.13.5/8.13.5) with ESMTP id jB75igMe025277
	for <gentoo-amd64@lists.gentoo.org>; Wed, 7 Dec 2005 05:44:42 GMT
Received: from ibm62aec.bellsouth.net ([70.149.252.120])
          by imf20aec.mail.bellsouth.net with ESMTP
          id <20051207054441.WHGF16266.imf20aec.mail.bellsouth.net@ibm62aec.bellsouth.net>
          for <gentoo-amd64@lists.gentoo.org>;
          Wed, 7 Dec 2005 00:44:41 -0500
Received: from [192.168.1.100] (really [70.149.252.120])
          by ibm62aec.bellsouth.net with ESMTP
          id <20051207054441.JLZG17681.ibm62aec.bellsouth.net@[192.168.1.100]>
          for <gentoo-amd64@lists.gentoo.org>;
          Wed, 7 Dec 2005 00:44:41 -0500
Date: Tue, 6 Dec 2005 23:44:40 -0600 (CST)
From: Deedra Waters <dmwaters@gentoo.org>
To: gentoo-amd64@lists.gentoo.org
Subject: Re: [gentoo-amd64] mce log errors
In-Reply-To: <Pine.LNX.4.64.0512062202210.6176@monster>
Message-ID: <Pine.LNX.4.64.0512062343090.6137@monster>
References: <Pine.LNX.4.64.0512061447070.6176@monster>
 <1133912388.13889.49.camel@athena.fprintf.net> <Pine.LNX.4.64.0512062202210.6176@monster>
Precedence: bulk
List-Post: <mailto:gentoo-amd64@lists.gentoo.org>
List-Help: <mailto:gentoo-amd64+help@gentoo.org>
List-Unsubscribe: <mailto:gentoo-amd64+unsubscribe@gentoo.org>
List-Subscribe: <mailto:gentoo-amd64+subscribe@gentoo.org>
List-Id: Gentoo Linux mail <gentoo-amd64.gentoo.org>
X-BeenThere: gentoo-amd64@gentoo.org
Reply-to: gentoo-amd64@lists.gentoo.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Archives-Salt: d922593f-063a-49e6-8335-fa6b5476ec9e
X-Archives-Hash: 9d092ea8d107d1b018c64fb3cfbfb69a

Hrm, it doesn't look to be a heat issue. I opened the case and took a
look inside when I saw the last message, and it looks perfectly cool and
happy.

I do notice, however, that it only happens when I hammer the RAID
array, which is on a PCI Promise controller.
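
For anyone who wants to check the STATUS values in the quoted log by
hand, here is a minimal sketch of pulling the flag bits out of them. The
bit positions 32, 46 and 62 are taken straight from the mcelog
annotations below. The syndrome field position (bits 47 to 54) is my
assumption; it does reproduce the "ECC syndrome = 62" line for MCE 1,
and I only trust it for the northbridge bank. The function and constant
names are made up for illustration.

```python
# Bit positions as annotated in the quoted mcelog output.
CPU0_ERR_BIT = 32        # "bit32 = err cpu0"
CORRECTED_ECC_BIT = 46   # "bit46 = corrected ecc error"
OVERFLOW_BIT = 62        # "bit62 = error overflow (multiple errors)"

# Assumption: the 8-bit ECC syndrome sits in bits 47..54 (northbridge bank).
SYNDROME_SHIFT = 47
SYNDROME_MASK = 0xFF


def decode_status(status):
    """Return the flags of interest from a 64-bit MCE STATUS value."""
    return {
        "cpu0_err": bool((status >> CPU0_ERR_BIT) & 1),
        "corrected_ecc": bool((status >> CORRECTED_ECC_BIT) & 1),
        "overflow": bool((status >> OVERFLOW_BIT) & 1),
        "syndrome": (status >> SYNDROME_SHIFT) & SYNDROME_MASK,
    }


if __name__ == "__main__":
    # The two STATUS values from the quoted log.
    for name, value in (("MCE 1", 0x9431400100000813),
                        ("MCE 2", 0xD000400000000863)):
        flags = decode_status(value)
        print(name, hex(value))
        for key, val in flags.items():
            shown = hex(val) if key == "syndrome" else val
            print("   ", key, "=", shown)
```

Both values have the corrected-ECC bit set, and only the second one has
the overflow bit set, which agrees with what mcelog printed.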

On Tue, 6 Dec 2005, Deedra Waters wrote:

> Date: Tue, 6 Dec 2005 22:04:50 -0600 (CST)
> From: Deedra Waters <dmwaters@gentoo.org>
> Reply-To: gentoo-amd64@lists.gentoo.org
> To: gentoo-amd64@lists.gentoo.org
> Subject: Re: [gentoo-amd64] mce log errors
>
> Is there a way to test that? I've tried to work with lm_sensors,
> but its readings are way, way off. So, considering lm_sensors
> is useless, is there another way to tell if overheating is the problem?
>
> The case itself has a lot of fans, but it's also got 5 hard drives in it.
> On Tue, 6 Dec 2005, Daniel Gryniewicz wrote:
>
> > Date: Tue, 06 Dec 2005 18:39:48 -0500
> > From: Daniel Gryniewicz <dang@gentoo.org>
> > Reply-To: gentoo-amd64@lists.gentoo.org
> > To: gentoo-amd64@lists.gentoo.org
> > Subject: Re: [gentoo-amd64] mce log errors
> >
> > On Tue, 2005-12-06 at 14:56 -0600, Deedra Waters wrote:
> > > All,
> > >
> > > I'm getting a lot of these, but it only seems to happen when I put the
> > > machine under a lot of stress, and even then it's not always happening.
> > > This machine is a dual Opteron 242, the board is an Asus K8, and with
> > > the latest BIOS update, the machine has no real problems at all.
> > >
> > > MCE 1
> > > CPU 0 4 northbridge TSC 8f1a7b270b6f
> > > ADDR 75c3320
> > >   Northbridge ECC error
> > >   ECC syndrome = 62
> > >        bit32 = err cpu0
> > >        bit46 = corrected ecc error
> > >   bus error 'local node origin, request didn't time out
> > >       generic read mem transaction
> > >       memory access, level generic'
> > > STATUS 9431400100000813 MCGSTATUS 0
> > > MCE 2
> > > CPU 0 2 bus unit TSC 8f8ad2325db7
> > >   L2 cache ECC error
> > >   Bus or cache array error
> > >        bit46 = corrected ecc error
> > >        bit62 = error overflow (multiple errors)
> > >   bus error 'local node origin, request didn't time out
> > >       prefetch mem transaction
> > >       memory access, level generic'
> > > STATUS d000400000000863 MCGSTATUS 0
> >
> > CPU cache is getting ECC errors.  Smells like overheating.
> >
> > Daniel
> >
>
>

-- 
Deedra Waters - Gentoo developer relations, accessibility and infrastructure -
dmwaters@gentoo.org
Gentoo linux: http://www.gentoo.org

-- 
gentoo-amd64@gentoo.org mailing list