BLU Discuss list archive
Reminder -- RAID 5 is not your friend
- Subject: Reminder -- RAID 5 is not your friend
- From: feenberg-fCu/yNAGv6M at public.gmane.org (Daniel Feenberg)
- Date: Mon, 15 Mar 2010 15:41:29 -0400 (EDT)
- In-reply-to: <4B9E792D.1000003-KwkGvOEf1og@public.gmane.org>
- References: <20100311042755.GO14999@tao.merseine.nu> <67A567B1-63E9-45D2-BDB1-2A204BC692AD@gmail.com> <sjmvdd2py1c.fsf@pgpdev.ihtfp.org> <555ADC1F-1652-465D-A9C9-52E682698FE5@gmail.com> <20100311182042.GB14288@dragontoe.org> <EF578668-1DEA-4B3C-8385-E6CC0146E684@gmail.com> <sjmbpetpojp.fsf@pgpdev.ihtfp.org> <56F1E6AB-C0E7-4B90-98C0-A7259E1F325E@gmail.com> <4B9E792D.1000003@borg.org>
On Mon, 15 Mar 2010, Kent Borg wrote:

> Richard Pieri wrote:
>> And neither is RAID 1. Except when you get lucky.
>>
>> I had a failure over the weekend. Two mirrored pairs, A1/A2 B1/B2 configuration. A2 and B1 failed simultaneously.
>
> Sounds like it is *disks* that are not your friend. And, that they hate
> you enough that your use of raid isn't enough to save you.
>
> My conclusions:
>
> 1. don't run matched disks from the same manufacturer and lot
> 2. watch disk temperature
> 3. watch smartmon for indications of aging
> 4. replace disks before they die
> 5. use your replacements as an opportunity to get your pairs staggered
> 6. have backups that at minimum are ping-ponged, current, and physically
>    offline
> 7. goto #1...

In most cases this is not a true simultaneous failure caused by common
disk wear or defects, power supply events, or controller problems. In
most cases of apparent simultaneous failure, Disk 2 has a bad sector that
has never been written to. Such a sector can remain undisturbed for the
life of the disk, or until the RAID software attempts to sync with
another disk.

When Disk 1 fails (and the RAID software notices) and is replaced, the
sync starts copying Disk 2 to the new Disk 1. It runs until it reaches
the bad sector on Disk 2, at which point it announces that Disk 2 has
failed. But Disk 2 didn't fail during the sync - the sector was probably
bad from day 1, and if it had ever been written to, it would have been
remapped transparently to the user and the RAID software. But sync
doesn't write before reading. The only good thing is that data can still
be read in degraded mode and copied to another disk.

This is why simultaneous failures are so common in practice, even though
statistically they should be quite rare.

Daniel Feenberg

> -kb
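The sequence described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how any real RAID implementation or drive firmware works: the `Disk` class, the `rebuild` and `readable_sectors` helpers, the 8-sector disk size, and the choice of sector 5 as the latent bad sector are all hypothetical.

```python
# Toy model of a mirror rebuild hitting a latent bad sector.
# Every name and number here is hypothetical and chosen for illustration.

class ReadError(Exception):
    """Raised when a sector cannot be read."""


class Disk:
    def __init__(self, n_sectors, latent_bad=()):
        self.data = [0] * n_sectors
        self.latent_bad = set(latent_bad)  # unreadable until rewritten
        self.remapped = set()              # sectors moved to spares

    def read(self, sector):
        if sector in self.latent_bad:
            raise ReadError(sector)
        return self.data[sector]

    def write(self, sector, value):
        # Assumption: writing to a bad sector makes the drive remap it
        # to a spare, invisibly to the caller.
        if sector in self.latent_bad:
            self.latent_bad.discard(sector)
            self.remapped.add(sector)
        self.data[sector] = value


def rebuild(source, target):
    """Copy every sector from source to target.

    Returns None on success, or the first sector that could not be read.
    """
    for sector in range(len(source.data)):
        try:
            target.write(sector, source.read(sector))
        except ReadError:
            return sector
    return None


def readable_sectors(disk):
    """Sectors that can still be read individually (degraded mode)."""
    ok = []
    for sector in range(len(disk.data)):
        try:
            disk.read(sector)
            ok.append(sector)
        except ReadError:
            pass
    return ok


# Case 1: Disk 2 carries a bad sector that was never written to.
disk2 = Disk(8, latent_bad={5})
failed_at = rebuild(disk2, Disk(8))   # sync reads before anything writes
salvaged = readable_sectors(disk2)    # everything else is still readable

# Case 2: the same sector was written at some point, so it was remapped.
disk2_written = Disk(8, latent_bad={5})
disk2_written.write(5, 42)
failed_at_after_write = rebuild(disk2_written, Disk(8))
```

In the first case the rebuild stops at the sector that was bad all along, which looks like a second disk failing during the sync, while the remaining sectors can still be copied off. In the second case the earlier write has already triggered the remap, so the rebuild completes.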