BLU Discuss list archive

Subject: [Discuss] SSD drives vs. Mechanical drives
From: blu at nedharvey.com (Edward Ned Harvey (blu))
Date: Wed, 7 May 2014 13:56:29 +0000
In-reply-to: <CAJFsZ=p22nncGGD3WWxv-PNsOv1chxTBZjmaL-ao465WVazgdg@mail.gmail.com>
References: <5364F3FB.40707@blu.org> <5367AE30.5020205@borg.org> <5367B2A9.3090804@gmail.com> <5367E6E2.7050005@borg.org> <5367EB97.5070501@gmail.com> <536801C9.9030407@borg.org> <697921c8eefc4fde8852a86b2e2f3e12@CO2PR04MB684.namprd04.prod.outlook.com> <CAJFsZ=p22nncGGD3WWxv-PNsOv1chxTBZjmaL-ao465WVazgdg@mail.gmail.com>

> From: Bill Bogstad [mailto:bogstad at pobox.com]
> 
> > Truth is:  Hardware mirroring doesn't provide data integrity.  But software
> > mirroring with btrfs/zfs do indeed provide data integrity.
> 
> For purposes of this email:
> 
> data loss: you don't get any data
> data integrity: you get data, but it isn't what you wrote to the storage system
> 
> Mirroring will help prevent data loss, but not help with data integrity.
> (Unless you read both copies and compare at which point you
> have converted a data integrity event into a possible data loss event).

Seriously dude?

The way ZFS and BTRFS behave is as follows:  The filesystem is aware of multiple redundant copies of data.  In normal operation, the filesystem reads non-overlapping blocks from multiple devices in parallel to increase performance.  All data is checksummed, so if any of the data read is corrupt, the corruption is detected.  When corruption is detected on a device, the filesystem reads an alternate redundant copy to retrieve valid data, rewrites the bad block on the failed device with that valid data, and increments the cksum failure count for the device.  If a device accumulates too many cksum failures, it is marked bad.
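
To make the read path concrete, here is a toy sketch in Python.  It is not how ZFS or BTRFS are implemented: the Mirror class, its method names, and the use of CRC32 as the checksum are all assumptions made for illustration.  It only shows the verify, fall back, and repair sequence described above.

```python
import zlib


class Mirror:
    """Toy checksummed mirror (hypothetical; for illustration only)."""

    def __init__(self, n_devices=2):
        # Each device maps block number -> bytes.
        self.devices = [dict() for _ in range(n_devices)]
        # Checksum of each block as originally written (CRC32 is an assumption).
        self.checksums = {}
        # Per-device count of checksum failures seen so far.
        self.cksum_errors = [0] * n_devices

    def write(self, block, data):
        self.checksums[block] = zlib.crc32(data)
        for dev in self.devices:
            dev[block] = data

    def read(self, block, first=0):
        """Read a block, starting at device `first`, healing bad copies."""
        n = len(self.devices)
        bad = []
        for i in range(n):
            idx = (first + i) % n
            data = self.devices[idx][block]
            if zlib.crc32(data) == self.checksums[block]:
                # Found a valid copy: rewrite every copy that failed.
                for b in bad:
                    self.devices[b][block] = data
                return data
            self.cksum_errors[idx] += 1
            bad.append(idx)
        raise IOError(f"block {block}: no valid copy on any device")


m = Mirror()
m.write(7, b"hello")
m.devices[0][7] = b"hellp"      # silently corrupt one side
data = m.read(7, first=0)       # bad copy is detected, good copy is returned
```

After the read, the corrupt copy on device 0 has been rewritten from device 1, and the failure count for device 0 has gone up by one.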

So indeed, software mirroring with btrfs and zfs *does* provide data integrity.

Hardware raid, on the other hand, doesn't do checksumming.  It just does device block mapping.  When the OS requests a particular block of data, there is no way to know which individual device in the raid array actually served up the data.  So if the copies of a block inside a hardware raid set disagree because one of them is corrupt, it's possible to keep reading the same block over and over again and get different data each time, depending on which device happens to serve the read.
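
A toy sketch of that failure mode, again in Python.  The PlainMirror class and its round-robin read policy are assumptions for illustration; real controllers choose a device in their own ways.  The point is only that, without a checksum, nothing distinguishes the good copy from the bad one.

```python
import itertools


class PlainMirror:
    """Toy mirror with block mapping only, no checksums (hypothetical)."""

    def __init__(self):
        self.devices = [dict(), dict()]
        # Assumed read policy: alternate between the two devices.
        self._next = itertools.cycle([0, 1])

    def write(self, block, data):
        for dev in self.devices:
            dev[block] = data

    def read(self, block):
        # The caller never learns which device answered.
        return self.devices[next(self._next)][block]


pm = PlainMirror()
pm.write(3, b"good")
pm.devices[1][3] = b"g00d"                  # silently corrupt one side
reads = [pm.read(3) for _ in range(4)]      # same block, read four times
```

Here reads alternates between b"good" and b"g00d", and the mirror has no basis for deciding which one is right.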

In software (btrfs and zfs) you should periodically scrub.  In fact, this is something that would be good on *all* raid sets; it's just not available on hardware raid.  What a scrub does is this:  It reads all redundant copies of all data on all devices, searching for cksum failures anywhere in the storage, and attempts to correct them as described above.
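
As a rough sketch of what a scrub pass amounts to, here is a toy Python function.  The scrub() function, its arguments, and the CRC32 checksum are assumptions for illustration, not the real zfs or btrfs code.  It walks every block, checks every copy, and repairs any copy that disagrees with a verified good one.

```python
import zlib


def scrub(devices, checksums):
    """Toy scrub (hypothetical).

    devices:   list of dicts mapping block number -> bytes
    checksums: dict mapping block number -> CRC32 of the data as written
    Returns (repaired, unrecoverable), where repaired is a list of
    (device index, block) pairs and unrecoverable is a list of blocks.
    """
    repaired = []
    unrecoverable = []
    for block, expected in checksums.items():
        # Look for any copy whose checksum still matches.
        good = None
        for dev in devices:
            if block in dev and zlib.crc32(dev[block]) == expected:
                good = dev[block]
                break
        if good is None:
            # Every copy is bad: nothing left to repair from.
            unrecoverable.append(block)
            continue
        # Rewrite every copy that differs from the verified one.
        for idx, dev in enumerate(devices):
            if dev.get(block) != good:
                dev[block] = good
                repaired.append((idx, block))
    return repaired, unrecoverable


side_a = {0: b"alpha", 1: b"beta"}
side_b = {0: b"alpha", 1: b"bexa"}          # block 1 is silently corrupt
sums = {0: zlib.crc32(b"alpha"), 1: zlib.crc32(b"beta")}
repaired, lost = scrub([side_a, side_b], sums)
```

On the real filesystems the corresponding commands are zpool scrub (given a pool name) and btrfs scrub start (given a mount point).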

If you don't do a scrub, it's possible for one side of a mirror to have corrupt data, silently.  By bad luck, every read of that block happens to be served from the good device, so you never detect the corruption on the other device.  And then, by bad luck, the good side of the mirror suffers hardware failure.  Only after the hardware failure do you actually read the corrupt data and discover the corruption.  This could have been avoided if you had performed a scrub while a redundant copy of good data still existed.
