From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gentoo-user+bounces-150904-garchives=archives.gentoo.org@lists.gentoo.org>
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	by finch.gentoo.org (Postfix) with ESMTP id 35C1B1381F3
	for <garchives@archives.gentoo.org>; Tue, 24 Sep 2013 10:37:04 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 4B474E0BAE;
	Tue, 24 Sep 2013 10:36:57 +0000 (UTC)
Received: from mail.ramses-pyramidenbau.de (ramses-pyramidenbau.de [78.47.114.252])
	(using TLSv1 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id 32465E0A44
	for <gentoo-user@lists.gentoo.org>; Tue, 24 Sep 2013 10:36:56 +0000 (UTC)
Received: from [IPv6:2001:638:a01:8063:3a59:f9ff:fee3:794e] (unknown [IPv6:2001:638:a01:8063:3a59:f9ff:fee3:794e])
	(using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.ramses-pyramidenbau.de (Postfix) with ESMTPSA id B6D7C60356
	for <gentoo-user@lists.gentoo.org>; Tue, 24 Sep 2013 12:36:52 +0200 (CEST)
Message-ID: <52416B52.308@ramses-pyramidenbau.de>
Date: Tue, 24 Sep 2013 12:37:06 +0200
From: Ralf Ramsauer <ralf+gentoo@ramses-pyramidenbau.de>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130919 Thunderbird/17.0.9
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [Hardware Error]: MC1 Error: Copyback Parity/Victim
 error.
References: <CAN0CFw3tON8fZ1XvW02J3ZF2p9T-K0f1dJqOmXGHZ9TSSF7utw@mail.gmail.com> <CAEH5T2MdDoyH6GygG6DPBTdK_n=kCOP1xv90BTQf5r0NDNqKhw@mail.gmail.com> <52409F75.8080309@googlemail.com> <5240A367.9040803@ramses-pyramidenbau.de> <5240A7EB.1080305@ramses-pyramidenbau.de> <CAN0CFw0ni01Lkz0ySkm3Vpe4TU8o5_82TFODbDoAs1uYqLM8ag@mail.gmail.com>
In-Reply-To: <CAN0CFw0ni01Lkz0ySkm3Vpe4TU8o5_82TFODbDoAs1uYqLM8ag@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Archives-Salt: 46caaf40-a7e6-4a9b-93b6-b4ac3067ab84
X-Archives-Hash: 810dda64974f1d5df608f8496de0e54e

A friend of mine told me that AMD also had some trouble with the TLB
(translation lookaside buffer) on that architecture.
Unfortunately, I have no references for that issue.

I would keep an eye on that error, and if your system must be
highly available, I would even replace the hardware.

Regards,

--
Ralf

On 09/24/13 10:01, Grant wrote:
>> I had a deeper look into the kernel sources:
>>
>> Your exact error message is emitted by
>> static bool k8_mc1_mce(u16 ec, u8 xec)
>>
>> So probably you have a K8 ;-)
>>
>> Have a look at:
>> http://www.redhat.com/archives/rhelv5-list/2007-October/msg00075.html
> I read it. That one sounds like a correctable ECC RAM error.
>
>> It *might* be an error concerning ECC error correction. Did you recently
>> change any hardware?
> No hardware changed in a very long time.
>
>> Could you attach your /proc/cpuinfo?
> Sure, I've attached it.  I'm changing hosts and machines shortly and
> I've only seen this error once so I'm thinking I don't need to take
> action.
>
> - Grant
>
>
>>> I share this opinion.
>>> The message says - even if the error was corrected - that there's
>>> something dramatically wrong with your - I suppose - CPU.
>>> "Corrected error" might imply that some low-level feature got disabled
>>> in order to prevent further errors.
>>>
>>> Does this error appear only once at early boot or frequently?
>>>
>>> Regards,
>>> --
>>> Ralf
>>>
>>> On 09/23/13 22:07, Volker Armin Hemmann wrote:
>>>> On 23.09.2013 20:59, Paul Hartman wrote:
>>>>> On Mon, Sep 23, 2013 at 1:45 PM, Grant <emailgrant@gmail.com> wrote:
>>>>>> Can anyone tell me how to decipher this which has appeared in dmesg?
>>>>>> Google wasn't very helpful.
>>>>>>
>>>>>> [Hardware Error]: MC1 Error: Copyback Parity/Victim error.
>>>>>> [Hardware Error]: Error Status: Corrected error, no action required.
>>>>>> [Hardware Error]: CPU:3 (10:2:3) MC1_STATUS[-|CE|-|-|-]: 0x9000000000000171
>>>>>> [Hardware Error]: cache level: L1, tx: INSN, mem-tx: EV
>>>>> Looks like a machine check error: it detected an error in the L1 cache
>>>>> on your CPU.
>>>>>
>>>>> Since it says "Corrected error, no action required" I would not worry
>>>>> about it. If that makes you feel any better. :)
>>>>>
>>>>>
>>>> Since those errors are rare, I would worry about it.
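
For anyone who wants to decipher the status word from the dmesg output by
hand, here is a rough Python sketch. It assumes the usual x86 MCi_STATUS
layout (bit 63 = valid, bit 62 = overflow, bit 61 = uncorrected) and the
memory-error code format 0000 0001 RRRR TTLL in the low 16 bits, which is
what the kernel's decoder appears to print as "cache level", "tx" and
"mem-tx". The function name, dictionary keys and table names are my own
and not taken from the kernel sources.

```python
# Lookup tables for the low byte of the error code (labels as seen in dmesg;
# the table names themselves are made up for this sketch).
LL = ["L0", "L1", "L2", "L3/GEN"]      # cache level, bits 1:0
TT = ["INSN", "DATA", "GEN", "RESV"]   # transaction type, bits 3:2
RRRR = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]  # bits 7:4


def decode_mc_status(status):
    """Split a 64-bit MCi_STATUS value into a few readable fields."""
    ec = status & 0xFFFF  # low 16 bits: MCA error code
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "error_code": ec,
    }
    # Assumed memory-hierarchy error format: 0000 0001 RRRR TTLL
    if (ec & 0xFF00) == 0x0100:
        r = (ec >> 4) & 0xF
        fields["cache_level"] = LL[ec & 0x3]
        fields["tx"] = TT[(ec >> 2) & 0x3]
        fields["mem_tx"] = RRRR[r] if r < len(RRRR) else "unknown"
    return fields


if __name__ == "__main__":
    # Status value from the report at the start of this thread.
    for name, value in decode_mc_status(0x9000000000000171).items():
        print(name, value)
```

With the value from the report, the uncorrected bit comes out clear and the
low byte decodes to L1 / INSN / EV, which lines up with the "Corrected
error" and "cache level: L1, tx: INSN, mem-tx: EV" lines quoted above.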