BLU Discuss list archive


[Discuss] On Btrfs raid and odd-count disks

Subject: [Discuss] On Btrfs raid and odd-count disks
From: blu at nedharvey.com (Edward Ned Harvey (blu))
Date: Sat, 13 Apr 2013 14:21:05 +0000
In-reply-to: <sjmehefredy.fsf@mocana.ihtfp.org>
References: <5164C9F3.4040802@gmail.com> <e511dc18cf96ddb091411f8f6463f405.squirrel@mail.mohawksoft.com> <D1B1A95FBDCF7341AC8EB0A97FCCC4773BBC8927@SN2PRD0410MB372.namprd04.prod.outlook.com> <sjmwqsatoqi.fsf@mocana.ihtfp.org> <51658526.2060608@gmail.com> <sjm7gkatlyp.fsf@mocana.ihtfp.org> <51659281.6060409@gmail.com> <5165B3FD.7070605@gmail.com> <sjmk3o9rybh.fsf@mocana.ihtfp.org> <5166D866.3080402@gmail.com> <sjmehefredy.fsf@mocana.ihtfp.org>

> From: discuss-bounces+blu=nedharvey.com at blu.org [mailto:discuss-bounces+blu=nedharvey.com at blu.org] On Behalf Of Derek Atkins
> 
> > ZFS prevents write holes by enforcing atomicity of all writes to
> > storage. It does this by controlling all of the I/O caching involved in
> > the write process from system RAM down to the write acceleration cache
> > on the disks themselves. ZFS updates the file system only after all
> > cache points have confirmed being flushed.
> >
> > If any of these points lie about their status then write holes can
> > appear under power fault conditions. 

True, but at least with ZFS and Btrfs, any subsequent read of the corrupt data will be detected by the checksums.

Also, since we're talking about redundant storage, ZFS (and presumably Btrfs, since it's the obvious thing to do) will attempt to correct the error.  If a single disk (or any number of disks up to your redundancy level) wrote corrupt data (or no data), the checksum fails, and the FS tries the possible combinations of excluding devices and reconstructing from the rest, to identify which device(s) contain corrupt data.  If it finds a combination that produces a good checksum, it rewrites the data to whichever disk(s) failed.
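
To make the "try each combination" idea concrete, here is a toy sketch in Python.  It assumes a single-parity (XOR) stripe and a SHA-256 checksum stored separately from the data.  The names (xor_blocks, read_stripe) are made up for illustration, and this is not how ZFS or Btrfs are actually implemented, just the general shape of the logic.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def checksum(data_blocks):
    """Checksum over the data blocks only (hypothetical on-disk layout)."""
    return hashlib.sha256(b"".join(data_blocks)).hexdigest()


def read_stripe(devices, expected):
    """devices: list of blocks, the last one being XOR parity.

    Returns (data_blocks, repaired_index). repaired_index is None when the
    data matched the checksum on the first try.
    """
    data = list(devices[:-1])
    if checksum(data) == expected:
        return data, None

    # Checksum failed: assume each data device in turn is the bad one,
    # rebuild its block from all the others, and re-check.
    for bad in range(len(devices) - 1):
        others = [b for i, b in enumerate(devices) if i != bad]
        candidate = list(data)
        candidate[bad] = xor_blocks(others)
        if checksum(candidate) == expected:
            devices[bad] = candidate[bad]  # write the good block back
            return candidate, bad

    raise IOError("no single-device reconstruction matches the checksum")
```

With three data blocks plus parity, corrupting any one data block and calling read_stripe returns the original data and the index of the device that was rewritten.  Note the sketch only handles one bad device; more parity means more combinations to try.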


> Fair enough...  I don't know if standard (e.g. DM-level) RAID5 or RAID6
> provide for said "scrubbing"?  

Nope.
Scrubbing in this sense (finding and repairing silently corrupted blocks) is only possible thanks to checksumming at the raid level.  Without that, your raid depends on the underlying devices to report errors correctly.  If an error isn't noticed by the hardware and escalated to the OS, it passes through standard raid undetected.
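
A toy illustration of why parity alone isn't enough, again in Python with made-up helper names and a simple XOR-parity stripe (an assumption for the example, not md or any real RAID code).  A parity pass can notice that a stripe is inconsistent, but every "assume device N is bad" repair produces a self-consistent stripe, so without a checksum there is nothing to say which one holds the right data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def parity_consistent(devices):
    """True when the last block equals the XOR of the data blocks."""
    return xor_blocks(devices[:-1]) == devices[-1]


def candidate_repairs(devices):
    """One repaired stripe per hypothesis 'device i is the bad one'."""
    repairs = []
    for bad in range(len(devices)):
        others = [b for i, b in enumerate(devices) if i != bad]
        fixed = list(devices)
        fixed[bad] = xor_blocks(others)  # rebuild the suspect from the rest
        repairs.append(fixed)
    return repairs
```

If you silently corrupt one data block, parity_consistent returns False for the stripe, yet every entry from candidate_repairs passes parity_consistent and they disagree about the data.  The checksum is what breaks the tie.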

How often does that happen?  Well, in my experience, heavy usage on several TB of enterprise SATA hardware produces a bit error about once every 1-2 years, as shown by the ZFS checksum error counter incrementing without the hard drive's own error counter incrementing.  This means the error passed through the drive undetected, and was identified and corrected by ZFS.
