Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month, online, via Jitsi Meet.

BLU Discuss list archive



[Discuss] Crashplan is discontinued



On Thu, Aug 31, 2017 at 10:02 PM, Mike Small <smallm at sdf.org> wrote:
> John Abreau <abreauj at gmail.com> writes:
>
>> I've heard of tools using MD5 or SHA1 hashes to identify duplicates, and
>> potential issues with hash collisions causing false positives.
>
> By accident or maliciously? The numbers seem off for accidental
> collisions. An md5 sum is a 32 digit hex number (16 bytes). That gives
> 340282366920938463463374607431768211456 potential hash sums (or does the
> algorithm offer only a smaller subset?). I'm not going to bother to
> compute the probability of a collision. It's a very remote possibility,
> yes? For the malicious case, if someone's able to mess with the hashes
> used by deduplication code in your file system or in your hopefully
> almost as good userland equivalent (which of course must use git in some
> way or another for reasons that are not clear to me) you have unsolvable
> problems.
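
For anyone who does want to compute it, here is a rough sketch of the accidental-collision odds using the standard birthday-bound approximation. It assumes MD5 output behaves like a uniformly random 128-bit value, which is an idealization; the function name and the file counts are made up for illustration.

```python
import math

# Number of possible MD5 digests: 128 bits, i.e. 32 hex digits.
MD5_SPACE = 2 ** 128

def collision_probability(n_items: int, space: int = MD5_SPACE) -> float:
    """Approximate chance that at least two of n_items hashes collide.

    Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    Assumes hashes are uniformly distributed (an idealization).
    """
    if n_items < 2:
        return 0.0
    exponent = n_items * (n_items - 1) / (2 * space)
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-exponent)

if __name__ == "__main__":
    # Hypothetical file counts, chosen only to show the scale.
    for n in (10 ** 6, 10 ** 9, 10 ** 12):
        print(f"{n:>16,} files: p ~ {collision_probability(n):.3e}")
```

Under that assumption, even a trillion files leaves the accidental case vanishingly unlikely. The malicious case is a separate matter, since practical collision attacks against MD5 are known.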

Does git compare only the checksum, or does it look at file size as well?
I would think that also comparing file sizes would make it even harder
to get a collision.
The only duplicate checksums that I've ever seen in practice were on
zero-length files.
Zero-length files are, of course, all perfect duplicates of each other... :-)
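
On the size question, a minimal sketch of how a git blob ID is derived may help. As I understand the object format, git hashes a short header containing the content length together with the content, so the size is folded into the ID itself. The function name below is my own, and this covers only classic SHA-1 repositories, not ones created with SHA-256 object IDs.

```python
import hashlib

def git_blob_sha1(data: bytes) -> str:
    """Compute a blob ID the way `git hash-object` does for SHA-1 repos.

    The hashed bytes are: b"blob <decimal length>\\0" followed by the content.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()

if __name__ == "__main__":
    # Every zero-length file maps to the same well-known blob ID.
    print(git_blob_sha1(b""))
    # A plain content hash with no header, for comparison.
    print(hashlib.md5(b"").hexdigest())
```

Because the length is part of the hashed input, two blobs of different sizes would have to collide despite differing headers. All empty files share one ID, which matches the zero-length observation above.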

Bill Bogstad



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org