decoding MCE Logs? Possible hardware issue?
Derek Atkins
derek-CrUh67yIh4IAvxtiuMwx3w at public.gmane.org
Wed Sep 29 10:44:39 EDT 2010
On Wed, September 29, 2010 10:10 am, Jerry Feldman wrote:
> On 09/29/2010 09:29 AM, Derek Atkins wrote:
>> Jerry Feldman <gaf-mNDKBlG2WHs at public.gmane.org> writes:
>>
>>>> But I suspect there's still really a hardware problem somewhere. :(
>>>>
>>> Just one thing to add. I have a number of servers with Supermicro
>>> boards, and one of them won't boot unless I blacklist one of the edac
>>> modules. That system has 64GB ECC memory and either 1 or 2 Intel Xeon
>>> CPUs (one of my systems has only 1 CPU; the rest have 2). If you are
>>> interested I can email you the list of modules I am blacklisting.
>>>
>> Note that this is a Supermicro with AMD CPUs. It only has 16GB RAM
>> right now, but I might extend that if I find that some of the RAM is
>> bad. The system boots just fine, and I do not have any edac modules
>> loaded at all (according to lsmod). So I'm not sure what blacklisting
>> them would accomplish.
>>
>> -derek
>>
> I've got 5 systems with Supermicro X7DB8+ motherboards, and only one
> has problems with the edac modules. In my search for a solution to the
> udev hang problem I found a lot of pointers to Supermicro boards. I
> don't know why that one has the issue. It certainly is a much different
> issue than you have. My lsmod on another system shows:
> [gaf@boslc05 ~]$ lsmod | grep edac
> i5000_edac             42177  0
> edac_mc                60193  1 i5000_edac
>
> In any case I was just trying to provide some additional information.
Interesting! I wonder if this is an Intel v. AMD thing? Or perhaps a
2.6.27 v. 2.6.34 thing? Or maybe it's an X7DB8+ v. H8DA3-2 thing?
I'd turn off edac if it looked like it was actually loading on my system.
Ahh, the joys of ECC RAM -- harder to tell when the RAM is bad. ;)
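
For anyone who wants to run the same check on their own box, here is a
rough sketch of how I'd go about it. The function names (edac_modules,
blacklist_lines) are made up for illustration, and the file name in the
usage comment is hypothetical; the only real pieces are lsmod's
three-column output and the "blacklist" directive that modprobe.d
understands. Treat it as a starting point, not a tested recipe for any
particular board.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not from any package.

# Print the names of EDAC-related modules found in lsmod-style input
# (columns: name, size, used-by). Reads stdin, so it works on saved
# output as well as on a live `lsmod`.
edac_modules() {
    awk '$1 ~ /edac/ { print $1 }'
}

# Turn module names (one per line on stdin) into modprobe.d
# "blacklist" lines.
blacklist_lines() {
    while read -r mod; do
        if [ -n "$mod" ]; then
            printf 'blacklist %s\n' "$mod"
        fi
    done
}

# Typical use (commented out; the target file name is hypothetical):
#   lsmod | edac_modules
#   lsmod | edac_modules | blacklist_lines > /etc/modprobe.d/blacklist-edac.conf
```

If the first command prints nothing, no edac modules are loaded and
there is nothing to blacklist, which is the situation on my system.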
-derek