So, I've been thinking: I'm no good at the command line, but I can hold my own with MySQL, so I'm going to populate a MySQL table with each file's name (plus path) and its MD5 checksum, then run queries on the table. Watch this space...

Tom Haskins-Vaughan wrote:
> Thanks guys, I'll have a look and let you know how I get on.
>
> Tom Metro wrote:
>> David Kramer wrote:
>>> Tom Haskins-Vaughan wrote:
>>>> I have a directory, /home/photos, and in that folder are lots and
>>>> lots of photos in many different subfolders.
>>>
>>> sum /home/photos/* | sort
>>>
>>> Wherever you see the same number at the beginning of two consecutive
>>> lines, you have a match.
>>
>> Good idea, but the OP mentioned that they're in subfolders, and sum
>> won't traverse the directory tree. You can, however, use 'find' to do
>> that, and then post-process the output with 'cut', 'sort', and 'uniq'
>> to report only the files that are identical.
>>
>> But I'd probably just write a small Perl program to do it using
>> File::Find and Digest::MD5, or Perl's built-in checksum capability.
>>
>> This is a common problem, so if you're not up to scripting a solution,
>> check Freshmeat.net. There's probably one specifically designed for
>> finding duplicate images.
>>
>> -Tom
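A minimal sketch of the find-based pipeline Tom Metro describes, assuming GNU coreutils; it substitutes md5sum for sum, which puts the checksum in a fixed-width field and makes the 'cut' step unnecessary:

    # Checksum every file under /home/photos, sort so identical checksums
    # land on adjacent lines, then print only the repeated ones.
    # -w32 tells uniq to compare just the first 32 characters (the MD5);
    # -D prints every member of each duplicate group. Both are GNU options.
    find /home/photos -type f -exec md5sum {} + | sort | uniq -w32 -D

Filenames containing newlines would confuse this, but for a typical photo collection it does the job in one line.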
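And a sketch of the MySQL approach proposed at the top of the thread; the database, table, and column names here are invented for illustration, not taken from the poster's actual setup, and LOAD DATA LOCAL must be enabled on the server:

    # Hypothetical schema: one row per file.
    mysql photos -e 'CREATE TABLE checksums (path VARCHAR(512), md5 CHAR(32));'

    # Build a tab-separated file of "path<TAB>checksum". md5sum prints
    # "checksum  path", so strip the leading hash and two spaces to
    # recover the path even when it contains spaces.
    find /home/photos -type f -exec md5sum {} + \
      | awk '{ path = $0; sub(/^[0-9a-f]+  /, "", path); print path "\t" $1 }' \
      > /tmp/checksums.tsv
    mysql --local-infile=1 photos \
      -e "LOAD DATA LOCAL INFILE '/tmp/checksums.tsv' INTO TABLE checksums;"

    # Duplicates are checksums shared by more than one path.
    mysql photos -e 'SELECT md5, GROUP_CONCAT(path SEPARATOR ", ")
                     FROM checksums GROUP BY md5 HAVING COUNT(*) > 1;'

Once the checksums are in a table, the duplicates can be sliced further with ordinary queries, which is presumably the appeal of this route over a one-shot pipeline.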