From: Daniel Drake
Date: Thu, 13 May 2004 16:04:38 +0100
To: Kevin
Cc: Gentoo Dev
Subject: Re: [gentoo-dev] Major MCE problem with SMP on Gentoo kernels
Message-ID: <40A38E86.1090609@gentoo.org>
In-Reply-To: <200405130706.12534.gentoo-dev@gnosys.biz>

Hi Kevin,

Kevin wrote:
> Greg KH thinks it's bad memory, but I'm skeptical of that because the main
> address that fails (some 30 times in a row) is at 1023.8MB and the Dell
> Utilities only test up to 1022MB, and because I haven't seen the problem
> with the liveCD kernel.

Although I've very rarely dealt with SMP systems, I've seen many unstable systems being diagnosed by various memory testing utilities as OK.
As soon as you run memtest, errors come up, and replacing the faulty memory amazingly brings system stability again.

If your RAM is always producing errors in the same place (and only in one place) then you might want to google for BadMem/BadRAM. These are two flavours of kernel patch which allow you to ask the kernel to ignore specific blocks of memory. You can even get memtest-x86 to output the exact parameters you need based on the memory faults it finds.

This should allow you to ignore the faulty part of the memory and continue on with the remaining ~1020MB or so.

Daniel

--
gentoo-dev@gentoo.org mailing list
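To make the "ignore specific blocks of memory" idea concrete, here is a rough sketch of how an address/mask pair can describe a bad page. It assumes the BadRAM-style convention where an address counts as faulty when its masked bits equal the masked bits of the pattern address, a 4KB page size, and 32-bit physical addresses. The fault address below is a made-up value near 1023.8MB, not one taken from Kevin's logs, and the helper names are hypothetical. The `badram=` string format is also an assumption, so check the patch documentation and the memtest output for the exact syntax.

```python
from typing import Tuple

PAGE_SIZE = 4096          # assumed x86 page size
ADDR_BITS_MASK = 0xFFFFFFFF  # assumes 32-bit physical addresses


def badram_pattern(fault_addr: int, page_size: int = PAGE_SIZE) -> Tuple[int, int]:
    """Return an (address, mask) pair covering the one page holding fault_addr."""
    mask = ~(page_size - 1) & ADDR_BITS_MASK  # clear the in-page offset bits
    return fault_addr & mask, mask


def is_excluded(addr: int, pattern: Tuple[int, int]) -> bool:
    """True if addr falls inside the region described by the pattern."""
    base, mask = pattern
    return (addr & mask) == (base & mask)


def format_param(pattern: Tuple[int, int]) -> str:
    """Render the pair as a kernel-parameter style string (format assumed)."""
    base, mask = pattern
    return f"badram=0x{base:08x},0x{mask:08x}"


if __name__ == "__main__":
    fault = 0x3FFCCCCC  # hypothetical failing address, roughly 1023.8MB
    pattern = badram_pattern(fault)
    print(format_param(pattern))
    print(is_excluded(fault, pattern))
```

A mask with fewer bits set covers a larger block, so a cluster of nearby faults can be folded into a single pattern at the cost of giving up some good memory along with the bad.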