So, I've been thinking: I'm no good at the command line, but I can hold my own with MySQL, so I'm going to populate a MySQL table with each file's name (plus path) and its MD5 checksum, then run queries on the table. Watch this space...

Tom Haskins-Vaughan wrote:
> Thanks guys, I'll have a look and let you know how I get on.
>
> Tom Metro wrote:
>> David Kramer wrote:
>>> Tom Haskins-Vaughan wrote:
>>>> I have a directory, /home/photos, and in that folder are lots and
>>>> lots of photos in many different subfolders.
>>>
>>> sum /home/photos/* | sort
>>>
>>> Wherever you see the same number at the beginning of two consecutive
>>> lines, you have a match.
>>
>> Good idea, but the OP mentioned that they're in subfolders, and sum
>> won't traverse the directory tree. You can, however, use 'find' to do
>> that, and then post-process the output with 'cut', 'sort', and 'uniq'
>> to report only the files that are identical.
>>
>> But I'd probably just write a small Perl program to do it using
>> File::Find and Digest::MD5, or Perl's built-in checksum capability.
>>
>> This is a common problem, so if you're not up to scripting a solution,
>> check Freshmeat.net. There's probably one specifically designed for
>> finding duplicate images.
>>
>> -Tom
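A minimal sketch of the find-based pipeline Tom Metro describes, assuming GNU coreutils; it substitutes md5sum for sum, which puts the checksum in a fixed-width field and makes the 'cut' step unnecessary:

    # Checksum every file under /home/photos, sort so identical checksums
    # land on adjacent lines, then print only the repeated ones.
    # -w32 tells uniq to compare just the first 32 characters (the MD5);
    # -D prints every member of each duplicate group. Both are GNU options.
    find /home/photos -type f -exec md5sum {} + | sort | uniq -w32 -D

Filenames containing newlines would confuse this, but for a typical photo collection it does the job in one line.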
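And a sketch of the MySQL approach proposed at the top of the thread; the database, table, and column names here are invented for illustration, not taken from the poster's actual setup, and LOAD DATA LOCAL must be enabled on the server:

    # Hypothetical schema: one row per file.
    mysql photos -e 'CREATE TABLE checksums (path VARCHAR(512), md5 CHAR(32));'

    # Build a tab-separated file of "path<TAB>checksum". md5sum prints
    # "checksum  path", so strip the leading hash and two spaces to
    # recover the path even when it contains spaces.
    find /home/photos -type f -exec md5sum {} + \
      | awk '{ path = $0; sub(/^[0-9a-f]+  /, "", path); print path "\t" $1 }' \
      > /tmp/checksums.tsv
    mysql --local-infile=1 photos \
      -e "LOAD DATA LOCAL INFILE '/tmp/checksums.tsv' INTO TABLE checksums;"

    # Duplicates are checksums shared by more than one path.
    mysql photos -e 'SELECT md5, GROUP_CONCAT(path SEPARATOR ", ")
                     FROM checksums GROUP BY md5 HAVING COUNT(*) > 1;'

Once the checksums are in a table, the duplicates can be sliced further with ordinary queries, which is presumably the appeal of this route over a one-shot pipeline.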