Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month at the Massachusetts Institute of Technology, in Building E51.

BLU Discuss list archive


[Discuss] Crashplan is discontinued

Bill Bogstad <bogstad at> writes:

> On Thu, Aug 31, 2017 at 10:02 PM, Mike Small <smallm at> wrote:
> Does git only compare the checksum or does it also look at file size as well?
> I would think that comparing file size might make it even harder to
> get a collision.
> The only duplicate checksum that I've ever seen in practice was on 0
> length files.
> Zero length files are, of course, all perfect duplicates of each other... :-)

Ah, git plumbing. Not really my specialty, but I think the answer is
implied by some of the docs, kind of. I'll add some guesswork, and if
someone knows better he or she can correct me.

Zero length file collisions are not an issue in git because the stuff in
its store (.git/objects/{first two letters of SHA1 hash}/{rest of SHA1
hash}) includes both the file contents themselves (blobs - check me in
gitglossary(7)) and tree objects, which capture file and directory
names and reference the content blobs. Here's some of my
.emacs.d/.git/objects contents (not a great use of git I'm finding,
btw. I should have done it down at the level where I only have files I
treat as my source code as opposed to stuff emacs changes behind my
back):

8613r0:.git$ du -a objects/ | head                                                       
4       objects/af/2ef3b97a02a0cdc859c59e4d39d6a7aa01116c
4       objects/af/ef5e0daed0ecdf0d51dcc347149ae2e1f0e998
12      objects/af
4       objects/d7/2834524cad924ea210b41920293a6fcc5d72ff
8       objects/d7
4       objects/17/dc6f4f501ce4ee0f3488d246b825d0c3ad63fe
4       objects/17/62e9cf542a661f15351b3bb2c50e1a1d26a1cd
12      objects/17
4       objects/pack
4       objects/8e/709560a5a09f69f8be7665ad66e3c394620123

So if I'm understanding rightly you could have 10 zero length files in
git with different names and that's not a problem. The names live in
tree objects in the store (one tree per directory, with one entry per
file), and those 10 entries would all reference one blob object, which
sits under its own SHA1 directory and file name and holds the contents
(or lack thereof).
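
On the file size question, here is a minimal sketch of how a blob's
name is derived, assuming a repository using the SHA-1 object format.
The hash is taken over a short header, "blob <size>" followed by a NUL
byte, and then the contents, so the length is already part of what gets
hashed. The function names git_blob_id and loose_object_path are made
up for this example and are not git's own.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git gives a blob with this content."""
    # The header carries the object type and the content length in bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Hypothetical helper: map an object id to its loose-object path."""
    # First two hex digits name the directory, the rest name the file.
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


empty_id = git_blob_id(b"")
print(empty_id)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty_id))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Every zero length file gets that same id no matter what it is called,
which fits the picture above: one blob, many names pointing at it. The
result can be compared against "git hash-object" run on an empty file.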

So far I don't think I see an actual compare, necessarily; it just
creates these tree objects and creates the blob object. Maybe it
overwrites the blob object for each file or maybe it sees it already
exists and just references it, I don't know. Kind of doesn't matter
except for performance or whatever. Or does it?

Let's take the malicious case. You want to get a file into the store
that has the same hash as an existing blob file, so that existing
references now have your contents instead of the original stuff. So
you'd be creating whatever tree object in the store, no hash collision
on that, but you'd want your file blob object to overwrite an existing
one. Unless my guesswork here is totally off I'm going to say git must
simply overwrite a blob file if you succeed in getting a hash
collision. If it did a compare to see if a path with the sha1 number was
already under .git/objects and didn't bother to write the new contents
then a hash collision couldn't be a real vulnerability and there
shouldn't have been a thread discussing it.
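
To make the two possibilities concrete, here is a toy in-memory
content-addressed store. It is not git's code and says nothing about
which policy git really follows; the ToyObjectStore class and its
overwrite_existing flag are invented for the illustration. Since I
can't produce a real SHA-1 collision, the collision is simulated by
storing different contents under the same key.

```python
import hashlib


class ToyObjectStore:
    """Toy stand-in for an object store keyed by hash (hypothetical)."""

    def __init__(self, overwrite_existing: bool):
        self.overwrite_existing = overwrite_existing
        self.objects = {}  # maps hex key -> stored bytes

    def put(self, key: str, content: bytes) -> None:
        # Policy "keep first": an existing key is left untouched.
        if key in self.objects and not self.overwrite_existing:
            return
        # Policy "overwrite": the newest content wins.
        self.objects[key] = content


key = hashlib.sha1(b"original contents").hexdigest()

for overwrite in (False, True):
    store = ToyObjectStore(overwrite_existing=overwrite)
    store.put(key, b"original contents")
    store.put(key, b"attacker contents")  # simulated collision on the same key
    print(overwrite, store.objects[key])
# False b'original contents'
# True b'attacker contents'
```

With the keep-first policy, existing references still resolve to the
original contents after the colliding write; with the overwrite policy
they silently get the attacker's. Which of the two git implements is
the thing to confirm in its source.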

But I could be way off here. If you really want to know probably you
want to start by reading gitcore-tutorial(7), gitrepository-layout(7),
and maybe the source of git-hash-object or some other plumbing
command. Oh wait, git-hash-object I see now is a link to git, so you'd
have to read the top of the source which looks at what the execed
filename was, assuming I have indeed picked the right command here. The
plumbing man pages are pretty thin. Maybe higher level commands are
relevant here too.

Mike Small
smallm at

BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.

