> From: Mark Woodward [mailto:markw-FJ05HQ0HCKaWd6l5hS35sQ at public.gmane.org]
>
> You know, I've read the same math and I've worked it out myself. I agree it
> sounds so astronomical as to be unrealistic to even imagine it, but no matter
> how astronomical the odds, someone usually wins the lottery.
>
> I'm just trying to assure myself that there isn't some probability calculation
> missing. I guess my gut is telling me this is too easy. We're missing something.

See, you're overlooking my first point. The cost of enabling verification is so
near zero that you should simply enable verification for the sake of not having
to justify your decision to anybody (including yourself, if you're not feeling
comfortable).

Actually, there are two assumptions being made:

(1) We're assuming sha256 is an ideally distributed hash function. Nobody can
prove that it's not - so we assume it is - but nobody can prove that it is,
either. If the hash distribution turns out to be imbalanced, for example if
certain hashes are more probable than others, then the probability of a hash
collision goes up.

(2) We're assuming the data in question is not being maliciously crafted to
cause a hash collision. I think this is a safe assumption, because in the event
of a collision you would have two different pieces of data that are assumed to
be identical, so one of them is thrown away... and personally I can accept the
consequence of discarding data if someone is intentionally trying to break my
filesystem.

> Besides, personally, I'm looking at 16K blocks which increases the probability
> a bit.

You seem to have that backward. First, the default block size is (up to) 128K.
Second, the smaller the block size, the larger the number of blocks in the
filesystem, and therefore the higher the probability of collision. If, for
example, you had 1 TB of data broken up into 1 MB blocks, you would have a
total of 2^20 blocks; broken up into 1 KB blocks, your block count would be
2^30. The more blocks being hashed, the higher the probability of a hash
collision.
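For anyone who wants to check the arithmetic, here's a rough back-of-the-envelope
sketch (my own Python, not anything out of ZFS; the pool size, block sizes, and
names are just illustrative) that applies the standard birthday-bound
approximation P ~= n^2 / 2^257 to a 1 TB pool at a few block sizes:

    #!/usr/bin/env python3
    # Birthday-bound estimate of the chance of a sha256 collision among the
    # blocks of a pool.  Back-of-the-envelope only; assumes sha256 behaves as
    # an ideally distributed hash (assumption (1) above).
    from math import log2

    HASH_BITS = 256  # sha256 output size

    def collision_estimate(pool_bytes, block_bytes):
        """Return (block count, log2 of approximate collision probability).

        Uses P(collision) ~= n^2 / 2^(HASH_BITS + 1), which is accurate
        while n is far below 2^128.
        """
        n = pool_bytes // block_bytes
        log2_p = 2 * log2(n) - (HASH_BITS + 1)  # stay in log2 space to avoid underflow
        return n, log2_p

    for block in (1024, 16 * 1024, 128 * 1024):  # 1K, 16K, 128K blocks
        n, log2_p = collision_estimate(2**40, block)  # ~1 TB pool
        print(f"{block // 1024:>4}K blocks: n = 2^{log2(n):.0f}, "
              f"P(collision) ~= 2^{log2_p:.0f}")

Even at 1K blocks (2^30 of them) the estimate works out to roughly 2^-197, the
"astronomical" figure being discussed; larger blocks (16K, 128K) only push it
lower still, to about 2^-205 and 2^-211.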