
BLU Discuss list archive



ECC vs Standard PC100/133 DIMMS



On Fri, 21 Apr 2000, Mark J. Dulcey wrote:

> On Thu, 20 Apr 2000, Randall Hofland wrote:
> 
> > Any thoughts on using ECC memory instead of standard PC100 or PC133
> > DIMMs?
> 
> I have always been a fan of using ECC memory in any system running a
> reliable operating system (i.e., one where you expect to measure the
> uptime in months or years). It potentially adds a bit more system
> stability. Memory errors may be rare, but they DO happen - my Alpha Linux
> box has in fact survived one corrected error, which showed up on the
> console. (Unfortunately, I don't think that Linux on X86 platforms manages
> to report error correction; the errors are just silently fixed.)

COLOSSUS.BILOW.COM is a fancy server machine in continuous use with ECC
DIMMs and dual CPUs, and it has never logged a single RAM error in
operation.  The system does, in fact, report errors: we tested this.
In theory, ECC seems like a good idea, but in practice it is probably a
waste of money.  Of course, this has to be weighed against the cost of the
machine making, say, a database error which could go undetected for an
indefinite period of time.
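
To make "corrected" versus "reported" concrete, here is a minimal sketch
in Python of a single-error-correcting, double-error-detecting (SECDED)
code.  It is a toy on 4 data bits, not what any particular chipset does;
a 72-bit ECC DIMM carries 64 data bits plus 8 check bits.  The names
encode and decode are made up for this illustration.

```python
# Toy SECDED code: Hamming(7,4) plus one overall parity bit.
# Illustrative only; encode/decode are hypothetical names.

def encode(d):
    """d: list of 4 data bits -> list of 8 bits stored in 'memory'."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4            # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4            # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4            # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7
    overall = 0
    for b in word:
        overall ^= b             # parity over the whole 7-bit word
    return word + [overall]

def decode(w):
    """Return (data_bits, status): 'ok', 'corrected' or 'uncorrectable'."""
    w = list(w)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos - 1]:
            syndrome ^= pos      # XOR of the positions holding a 1
    overall = 0
    for b in w:
        overall ^= b
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and fix it.
        if syndrome:
            w[syndrome - 1] ^= 1
        status = "corrected"     # this is the event a system could log
    else:
        # Nonzero syndrome but even parity: two bits flipped.
        status = "uncorrectable"
    return [w[2], w[4], w[5], w[6]], status

stored = encode([1, 0, 1, 1])
stored[4] ^= 1                   # simulate one flipped bit in RAM
print(decode(stored))
```

Whether the "corrected" event ever reaches a log or the console is up to
the chipset and the operating system, which is the point under dispute
above.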

> Of course, it does cost extra, and you have to have a system board that
> does ECC checking. Some of the low-end chipsets don't, or have speed
> limitations if ECC is enabled. (For example, the ALI Aladdin V chipset
> used on some Super Socket 7 motherboards does ECC, but only at PC66 speed;
> you have to turn it off if you use PC100 SDRAM at full speed.) So you have
> to decide whether the (probably slight) improvement in reliability is
> worth the expense.

The majority of motherboards in the x86 world will not make use of ECC
capability even if you spend the money for ECC RAM.  Only very high-end
boards, usually intended as server platforms, bother with it at all.  One
of the problems is that the L2 cache on most machines has no ECC or even
parity, and this is the point at which the most critical failures are
likely to occur.  As far as I know, no Intel processor, not even the
Xeon, has ECC or parity in the L1 cache.
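
For comparison with ECC, plain parity can only detect an error, not
repair it.  A rough sketch in Python, assuming one even-parity bit per
8-bit byte (the helper names are made up for illustration):

```python
# Even parity over one byte: detects an odd number of flipped bits,
# but cannot say which bit flipped, so nothing can be corrected.
# parity_bit and looks_valid are hypothetical names.

def parity_bit(bits):
    p = 0
    for b in bits:
        p ^= b
    return p

def looks_valid(bits, stored_parity):
    return parity_bit(bits) == stored_parity

byte = [1, 0, 1, 1, 0, 0, 1, 0]
stored_parity = parity_bit(byte)

one_flip = list(byte)
one_flip[3] ^= 1                 # single-bit error: detected
two_flips = list(one_flip)
two_flips[6] ^= 1                # second error: slips through unnoticed

print(looks_valid(one_flip, stored_parity))
print(looks_valid(two_flips, stored_parity))
```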

-- Mike


-
Subscription/unsubscription/info requests: send e-mail with
"subscribe", "unsubscribe", or "info" on the first line of the
message body to discuss-request at blu.org (Subject line is ignored).




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org