mcelog reports AMD DRAM Parity Error?

Derek Atkins warlord-DPNOqEs/LNQ at public.gmane.org
Tue Nov 23 09:26:15 EST 2010


Jarod Wilson <jarod-ajLrJawYSntWk0Htik3J/w at public.gmane.org> writes:

> On Nov 19, 2010, at 10:10 AM, Derek Atkins wrote:
>
>> Jarod Wilson <jarod-ajLrJawYSntWk0Htik3J/w at public.gmane.org> writes:
>> 
>>> On Nov 18, 2010, at 10:30 AM, Derek Atkins wrote:
> ...
>>>> Does this mean I have a busted CPU?  Or busted RAM?
>>> 
>>> RAM. However, it's not a fatal error, it's simply a corrected
>>> ECC error. I'm told this is all a single event here, and the
>>> event was the corrected ECC error, anyway. So you might want
>>> to replace some memory at some point, but hey, it's ECC memory
>>> doing what it's designed to do here.
>> 
>> Is there an easy way to figure out which bank of RAM had the error?
>> 
>> I guess I can wait until I have another issue..
>
> It's a mixed bag. For some boards, it's quite simple; for others,
> well, not so much... I'm particularly unsure how to do it with mcelog,
> but at least with EDAC, there's an edac-utils userspace package that
> can, among other things, upload an address/bank/whatever-to-slot
> mapping for specific motherboards...

In my case it's a SuperMicro H8DA3-2 with two Quad-Core AMD Opteron(tm)
Processor 2378 CPUs.  Would edac work here?
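
For reference, if an EDAC driver does load on this board (that is an
assumption, I have not confirmed it), the kernel exposes per-csrow
corrected/uncorrected error counters under
/sys/devices/system/edac/mc/. A minimal sketch that reads them follows;
read_csrow_counts is a made-up helper name, not part of edac-utils,
and mapping a csrow to a physical DIMM slot still depends on the
motherboard.

```python
import glob
import os


def read_csrow_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {(mc, csrow): (ce_count, ue_count)} from the EDAC sysfs tree.

    A counter that is missing or unreadable is reported as None.
    """
    counts = {}
    pattern = os.path.join(edac_root, "mc*", "csrow*")
    for csrow_dir in sorted(glob.glob(pattern)):
        mc = os.path.basename(os.path.dirname(csrow_dir))
        csrow = os.path.basename(csrow_dir)
        values = []
        # ce_count = corrected errors, ue_count = uncorrected errors
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(csrow_dir, name)) as fh:
                    values.append(int(fh.read().strip()))
            except (OSError, ValueError):
                values.append(None)
        counts[(mc, csrow)] = tuple(values)
    return counts


if __name__ == "__main__":
    counts = read_csrow_counts()
    if not counts:
        print("no EDAC csrow entries found (is an EDAC driver loaded?)")
    for (mc, csrow), (ce, ue) in sorted(counts.items()):
        print(f"{mc}/{csrow}: corrected={ce} uncorrected={ue}")
```

A non-zero corrected count on one csrow would at least narrow the
error down to a chip-select row, even without a slot mapping.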

It looks like I have not received a new mcelog entry.  Either that, or I
somehow disabled it a while ago and the mcelog upgrade didn't re-enable
it.  (Of course I don't remember what I did, and didn't log it.)
*sigh*

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord-DPNOqEs/LNQ at public.gmane.org                        PGP key available




