Subject: Re: ZFS and block deduplication

On 04/22/2011 11:53 AM, David Rosenstrauch <darose-prQxUZoa2zOsTnJN9+BGXg at public.gmane.org> wrote:
> On 04/22/2011 11:41 AM, Mark Woodward wrote:
>> I have been trying to convince myself that the SHA2/256 hash is
>> sufficient to identify blocks on a file system. Is anyone familiar
>> with this?
>>
>> The theory is that you take a hash value of a block on a disk, and
>> the hash, which is smaller than the actual block, is unique enough
>> that the probability of any two blocks producing the same hash is
>> actually less than the probability of hardware failure.
>>
>> Given a small enough block size with a small enough set size, I can
>> almost see it as safe enough for backups, but I certainly wouldn't
>> put mission-critical data on it. Would you? Tell me how I'm flat-out
>> wrong. I need to hear it.
>
> If you read up on the rsync algorithm
> (http://cs.anu.edu.au/techreports/1996/TR-CS-96-05.html), he uses a
> combination of two different checksums to determine block uniqueness.
> And, IIRC, even then he still does an additional final check to make
> sure that the copied data is correct (and copies it again if not).

That's rsync, and I tend to agree with their level of paranoia. Take a look at this link: http://blogs.sun.com/bonwick/entry/zfs_dedup
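As a rough sanity check on the "less likely than hardware failure" claim, the standard birthday-bound approximation can be sketched in a few lines. The function name and the 1 PiB / 4 KiB-block workload are my own illustrative assumptions, not figures from the thread:

```python
def collision_probability(n_blocks: int, hash_bits: int = 256) -> float:
    """Birthday-bound approximation for the chance that any two of
    n_blocks random hash values collide: p ~ n^2 / 2^(hash_bits + 1)."""
    return n_blocks * n_blocks / 2 ** (hash_bits + 1)

# Hypothetical workload: 1 PiB of data in 4 KiB blocks -> 2**38 blocks.
p = collision_probability(2 ** 38)
# p is on the order of 2**-181, astronomically smaller than typical
# quoted undetected-error rates for disks (roughly 1e-15 per bit read).
```

The comparison only holds if the hash behaves like a random function over the blocks, which is the usual (unproven) assumption behind hash-keyed dedup.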
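The rsync-style "final check" David describes maps directly onto a dedup store that confirms a digest hit byte-for-byte before reusing an existing block. This is a minimal sketch of that idea, not anyone's actual implementation; the function and its collision handling are hypothetical:

```python
import hashlib

def store_block(store: dict, block: bytes, verify: bool = True) -> str:
    """Deduplicating store: blocks are keyed by their SHA-256 digest.

    With verify=True, a digest hit is confirmed by a byte-for-byte
    comparison before the existing copy is reused, so a hash collision
    cannot silently merge two different blocks (the rsync-style
    paranoia from the thread). Collision handling here is simplified
    to an exception for illustration.
    """
    key = hashlib.sha256(block).hexdigest()
    existing = store.get(key)
    if existing is not None:
        if verify and existing != block:
            raise RuntimeError("SHA-256 collision detected: " + key)
        return key  # duplicate block: reuse the stored copy
    store[key] = block
    return key
```

With verify on, trust in the hash only saves the byte comparison's cost when digests differ; a match still pays for a full read, which is the trade-off ZFS exposes by letting dedup run with or without verification.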