BLU Discuss list archive
[Discuss] Crashplan is discontinued
- Subject: [Discuss] Crashplan is discontinued
- From: smallm at sdf.org (Mike Small)
- Date: Fri, 01 Sep 2017 16:35:20 +0000
- In-reply-to: <CAJFsZ=oyJ4B1qp=58+QBYz87Kh8w=huZcnaWgjqiFd2NQp9H1g@mail.gmail.com> (Bill Bogstad's message of "Fri, 1 Sep 2017 01:05:31 -0400")
- References: <mailman.9.1504195205.26755.discuss@blu.org> <5D31FEA5-F6DC-4F45-BD36-352F588E673E@pioneer.ci.net> <20170831193111.GB12801@newtao.randomstring.org> <CAFv2jcZW0fxdYTzvJUehTf9vG=5NA48OHNoDYaFUfYWNuhkFNw@mail.gmail.com> <chxa82fnpkd.fsf@sdf.org> <CAJFsZ=oyJ4B1qp=58+QBYz87Kh8w=huZcnaWgjqiFd2NQp9H1g@mail.gmail.com>
Bill Bogstad <bogstad at pobox.com> writes:

> On Thu, Aug 31, 2017 at 10:02 PM, Mike Small <smallm at sdf.org> wrote:
>
> Does git only compare the checksum or does it also look at file size as well?
> I would think that comparing file size might make it even harder to
> get a collision.
> The only duplicate checksum that I've ever seen in practice was on 0
> length files.
> Zero length files are, of course, all perfect duplicates of each other... :-)

Ah, git plumbing. Not really my specialty, but I think the answer is implied by some of the docs, kind of. I'll add some guesswork and if someone knows better he or she can correct me.

Zero length file collisions are not an issue in git because the stuff in its store (.git/objects/{first two letters of SHA1 hash}/{rest of SHA1 hash}) includes both the file contents themselves (blobs - check me in gitglossary(7)) and tree objects, which capture file and directory names and reference the content blobs.

Here's some of my .emacs.d/.git/objects contents (not a great use of git I'm finding, btw. I should have done it down at the level where I only have files I treat as my source code as opposed to stuff emacs changes behind my back.):

    8613r0:.git$ du -a objects/ | head
    4       objects/af/2ef3b97a02a0cdc859c59e4d39d6a7aa01116c
    4       objects/af/ef5e0daed0ecdf0d51dcc347149ae2e1f0e998
    12      objects/af
    4       objects/d7/2834524cad924ea210b41920293a6fcc5d72ff
    8       objects/d7
    4       objects/17/dc6f4f501ce4ee0f3488d246b825d0c3ad63fe
    4       objects/17/62e9cf542a661f15351b3bb2c50e1a1d26a1cd
    12      objects/17
    4       objects/pack
    4       objects/8e/709560a5a09f69f8be7665ad66e3c394620123
    ...

So if I'm understanding rightly you could have 10 zero length files in git with different names and that's not a problem. You'd have 10 tree entries in the store (a tree object lists names, each paired with the SHA1 of its content), all of which can reference one blob object, stored under its own SHA1 directory and file name, for the contents (or lack thereof).
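
On the file size question, a minimal sketch may help. A blob's id is the SHA1 of a short header (the word "blob", the content length in bytes, and a NUL byte) followed by the content, so the size is folded into the hash rather than compared separately. The helper names git_blob_id and loose_object_path are made up for illustration and are not git APIs, and the sketch ignores compression and packfiles.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA1 git assigns to a blob with this content.

    The hashed bytes are the header b"blob <size>\\0" followed by the content,
    so the length is part of what gets hashed.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Hypothetical helper: path of a loose object relative to .git."""
    return "objects/" + object_id[:2] + "/" + object_id[2:]


# Ten differently named empty files all map to the same blob id,
# so only one blob needs to exist in the store.
names = ["empty%d.txt" % i for i in range(10)]
ids = {name: git_blob_id(b"") for name in names}

empty_id = git_blob_id(b"")
print(empty_id)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty_id))  # objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(len(set(ids.values())))       # 1
```

The file names never enter the blob hash; they live only in the tree that references the blob, which is why identical contents under different names are harmless.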
I think so far I don't see an actual compare, necessarily, just it creates these tree objects and creates the blob object. Maybe it overwrites the blob object for each file or maybe it sees it already exists and just references it, I don't know. Kind of doesn't matter except for performance or whatever.

Or does it? Let's take the malicious case. You want to get a file into the store that has the same hash as an existing blob file, so that existing references now have your contents instead of the original stuff. So you'd be creating whatever tree object in the store, no hash collision on that, but you'd want your file blob object to overwrite an existing one.

Unless my guesswork here is totally off I'm going to say git must simply overwrite a blob file if you succeed in getting a hash collision. If it did a compare to see if a path with the SHA1 number was already under .git/objects and didn't bother to write the new contents then a hash collision couldn't be a real vulnerability and there shouldn't have been a thread discussing it.

But I could be way off here. If you really want to know probably you want to start by reading gitcore-tutorial(7), gitrepository-layout(7), and maybe the source of git-hash-object or some other plumbing command. Oh wait, git-hash-object I see now is a link to git, so you'd have to read the top of the source which looks at what the execed filename was, assuming I have indeed picked the right command here. The plumbing man pages are pretty thin. Maybe higher level commands are relevant here too.

-- 
Mike Small
smallm at sdf.org
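
The overwrite-or-skip question above can be made concrete with a toy content-addressed store. This is only a sketch of the two policies being guessed at, not a statement of what git itself does. The names store, load and simulate_collision are hypothetical, and the "collision" is faked by reusing one object id for different bytes, since no real SHA1 collision is produced here.

```python
import hashlib
import os
import tempfile


def object_id(content: bytes) -> str:
    """SHA1 over a git-style blob header plus the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def store(root: str, oid: str, content: bytes, overwrite: bool) -> bool:
    """Write content at root/<first two hex digits>/<rest of id>.

    Returns True if a file was written, False if an existing object was kept.
    """
    path = os.path.join(root, oid[:2], oid[2:])
    if os.path.exists(path) and not overwrite:
        return False  # skip policy: the first writer wins
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(content)
    return True


def load(root: str, oid: str) -> bytes:
    """Read back whatever is stored under the given id."""
    with open(os.path.join(root, oid[:2], oid[2:]), "rb") as f:
        return f.read()


def simulate_collision(overwrite: bool) -> bytes:
    """Store an object, then store different bytes under the SAME id."""
    with tempfile.TemporaryDirectory() as root:
        original = b"original contents\n"
        oid = object_id(original)
        store(root, oid, original, overwrite)
        # Pretend an attacker found different bytes with the same SHA1.
        store(root, oid, b"attacker contents\n", overwrite)
        return load(root, oid)


print(simulate_collision(overwrite=True))   # b'attacker contents\n'
print(simulate_collision(overwrite=False))  # b'original contents\n'
```

Under the overwrite policy every existing reference to that id silently picks up the attacker's bytes; under the skip policy the stored object is unchanged and the colliding content simply never lands in the store.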