
BLU Discuss list archive



[Discuss] Server won't boot kernel. initramfs problem?



Maybe not every 5 minutes, the way most checks are configured in Nagios. But running it once a day, or even once a week, so that Nagios has a chance to detect memory errors might be worth the overhead.

It would be sufficient just to detect that bad RAM exists. You have to power off the server anyway to replace a bad DIMM, so once you know you have bad RAM, you can run memtest86 to figure out the details. 

It would be better than wasting days or weeks replacing hard drives and reinstalling the OS before thinking to test memory. 
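
For what it's worth, here is a rough sketch of what such a check could look like, wrapping the memtester tool Bill mentions below. The 0/1/2/3 return values are the standard Nagios plugin states. The memtester exit-code bits (0x01 for allocation or invocation problems, 0x02 and 0x04 for actual test failures) are as I remember them from its man page, so treat them as an assumption and verify against your version. The function names, the 64M size, and the single loop are just illustrative choices, and memtester generally needs root (or a raised memlock limit) to lock the memory it tests.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(memtester_rc):
    """Map a memtester exit code to a (nagios_code, message) pair.

    Assumed memtester bits: 0x01 = could not allocate/lock memory or bad
    invocation, 0x02 = stuck-address test failed, 0x04 = another test failed.
    """
    if memtester_rc == 0:
        return OK, "MEMORY OK - no errors found"
    if memtester_rc & 0x06:
        # At least one test reported a real failure: bad RAM is likely.
        return CRITICAL, "MEMORY CRITICAL - memtester found errors (rc=%d)" % memtester_rc
    # Only setup problems: the test never really ran, so we know nothing.
    return UNKNOWN, "MEMORY UNKNOWN - memtester could not run (rc=%d)" % memtester_rc


def run_check(size="64M", loops=1, binary="memtester"):
    """Run memtester once and return (nagios_code, message)."""
    try:
        result = subprocess.run(
            [binary, size, str(loops)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError as exc:
        # Binary missing or not executable.
        return UNKNOWN, "MEMORY UNKNOWN - cannot execute %s: %s" % (binary, exc)
    return classify(result.returncode)
```

A plugin script would print the message and exit with the code, e.g. `code, msg = run_check(); print(msg); sys.exit(code)`. Scheduled once a day or once a week, that is enough to flag that bad RAM exists; memtest86 can sort out the details afterwards.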



On Feb 24, 2013, at 3:16 PM, Bill Bogstad <bogstad at pobox.com> wrote:

> On Sat, Feb 23, 2013 at 12:22 PM, John Abreau <abreauj at gmail.com> wrote:
>> RAM going bad silently is an aggravating problem, and we often don't think
>> to test the RAM when some mysterious error crops up. It would be great if
>> Nagios was able to test RAM automatically.
>> 
>> Is it possible to test RAM on a live system, rather than having to boot
>> into memtest86?
> 
> There is a user space memory tester that tries to use mlock() to avoid paging.
> 
> http://pyropus.ca/software/memtester/
> 
> Some obvious caveats would apply:
> 
> Kernel memory goes untested (whether code, data, or allocated to buffer caches).
> Memory allocated to other processes only gets tested when they get paged out.
> Since it is all virtual addresses, a failure doesn't tell you where
> the error occurred.
> Likely to trash your system performance while it runs.
> 
> So chances are you probably don't want to use it....
> 
> Bill Bogstad



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org