Date: Tue, 6 Dec 2005 23:44:40 -0600 (CST)
From: Deedra Waters <dmwaters@gentoo.org>
To: gentoo-amd64@lists.gentoo.org
Reply-to: gentoo-amd64@lists.gentoo.org
Subject: Re: [gentoo-amd64] mce log errors
In-Reply-To: <Pine.LNX.4.64.0512062202210.6176@monster>
Message-ID: <Pine.LNX.4.64.0512062343090.6137@monster>
References: <Pine.LNX.4.64.0512061447070.6176@monster> <1133912388.13889.49.camel@athena.fprintf.net> <Pine.LNX.4.64.0512062202210.6176@monster>

Hrm, it doesn't look to be a heat issue. I opened the case and took a
look inside when I saw the last message, and it looks perfectly cool and
happy. I do notice, however, that it's only happening when I hammer the
RAID array, which is on a PCI Promise controller.

On Tue, 6 Dec 2005, Deedra Waters wrote:

> Date: Tue, 6 Dec 2005 22:04:50 -0600 (CST)
> From: Deedra Waters <dmwaters@gentoo.org>
> Reply-To: gentoo-amd64@lists.gentoo.org
> To: gentoo-amd64@lists.gentoo.org
> Subject: Re: [gentoo-amd64] mce log errors
>
> Is there a way to test that? I've tried to work with lm_sensors,
> but the readings from it are way, way off. So, considering lm_sensors
> is useless, is there another way to tell if overheating is the problem?
>
> The case itself has a lot of fans, but it's also got 5 hard drives in it.
> On Tue, 6 Dec 2005, Daniel Gryniewicz wrote:
>
> > Date: Tue, 06 Dec 2005 18:39:48 -0500
> > From: Daniel Gryniewicz <dang@gentoo.org>
> > Reply-To: gentoo-amd64@lists.gentoo.org
> > To: gentoo-amd64@lists.gentoo.org
> > Subject: Re: [gentoo-amd64] mce log errors
> >
> > On Tue, 2005-12-06 at 14:56 -0600, Deedra Waters wrote:
> > > All,
> > >
> > > I'm getting a lot of these, but it only seems to happen when I put the
> > > machine under a lot of stress, and even then it's not always happening.
> > > This machine is a dual Opteron 242, the board is an Asus K8, and with
> > > the latest BIOS update, the machine has no real problems at all.
> > >
> > > MCE 1
> > > CPU 0 4 northbridge TSC 8f1a7b270b6f
> > > ADDR 75c3320
> > > Northbridge ECC error
> > > ECC syndrome = 62
> > > bit32 = err cpu0
> > > bit46 = corrected ecc error
> > > bus error 'local node origin, request didn't time out
> > > generic read mem transaction
> > > memory access, level generic'
> > > STATUS 9431400100000813 MCGSTATUS 0
> > > MCE 2
> > > CPU 0 2 bus unit TSC 8f8ad2325db7
> > > L2 cache ECC error
> > > Bus or cache array error
> > > bit46 = corrected ecc error
> > > bit62 = error overflow (multiple errors)
> > > bus error 'local node origin, request didn't time out
> > > prefetch mem transaction
> > > memory access, level generic'
> > > STATUS d000400000000863 MCGSTATUS 0
> >
> > CPU cache is getting ECC errors. Smells like overheating.
> >
> > Daniel
> >

--
Deedra Waters - Gentoo developer relations, accessibility and infrastructure -
dmwaters@gentoo.org
Gentoo linux: http://www.gentoo.org
--
gentoo-amd64@gentoo.org mailing list
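
For anyone who wants to cross-check the two STATUS words quoted in the log by hand, here is a minimal Python sketch. The meanings of bit46 (corrected ECC) and bit62 (overflow) come straight from the mcelog output; everything else (bit 63 valid, bit 61 uncorrected, the syndrome in bits 54:47, and the 0000 1PPT RRRR IILL bus-error layout of the low 16 bits) is my reading of the K8 machine-check status register and should be checked against AMD's documentation. The function and table names are made up for illustration.

```python
# Hypothetical helper for decoding the STATUS words from the log above.
# Only bit 46 and bit 62 are named in the log itself; the remaining
# field positions are assumptions about the K8 MCi_STATUS layout.

REQUEST = {
    0: "generic", 1: "generic read", 2: "generic write",
    3: "data read", 4: "data write", 5: "instruction fetch",
    6: "prefetch", 7: "evict", 8: "snoop",
}
PARTICIPATION = {
    0: "local node origin", 1: "local node response",
    2: "observed by third party", 3: "generic",
}
MEM_IO = {0: "memory", 1: "reserved", 2: "I/O", 3: "generic"}
LEVEL = {0: "level 0", 1: "level 1", 2: "level 2", 3: "generic"}


def decode_bus_error(code):
    """Decode a 16-bit error code of the form 0000 1PPT RRRR IILL."""
    if (code >> 11) != 1:
        return None  # not in bus-error format
    return {
        "participation": PARTICIPATION[(code >> 9) & 0x3],
        "timed_out": bool((code >> 8) & 0x1),
        "request": REQUEST.get((code >> 4) & 0xF, "unknown"),
        "mem_io": MEM_IO[(code >> 2) & 0x3],
        "level": LEVEL[code & 0x3],
    }


def decode_status(status):
    """Pull the interesting fields out of a 64-bit STATUS word."""
    code = status & 0xFFFF
    return {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),       # "bit62" in the log
        "uncorrected": bool((status >> 61) & 1),
        "corrected_ecc": bool((status >> 46) & 1),  # "bit46" in the log
        "syndrome": (status >> 47) & 0xFF,
        "error_code": code,
        "bus": decode_bus_error(code),
    }


if __name__ == "__main__":
    for label, word in (("MCE 1", 0x9431400100000813),
                        ("MCE 2", 0xD000400000000863)):
        fields = decode_status(word)
        print(label, "syndrome=%x" % fields["syndrome"], fields)
```

Under those assumptions, both words decode with bit 61 clear and bit 46 set, which agrees with the "corrected ecc error" lines, and the request fields come out as "generic read" and "prefetch" as in the log. That only confirms what mcelog already printed; it says nothing about whether heat, memory, or the PCI controller is behind the errors.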