
#13
04-21-2008
Rahul
 
Re: MD5 checksums from downloaded pdfs to prevent duplication

Javi <javibarroso@gmail.com> wrote in
news:a0994a7d-8568-4e2e-996e-877be073da6e@59g2000hsb.googlegroups.com:

> I'd use an application like fdupes
> (http://packages.debian.org/sid/fdupes), with a cron task perhaps



fdupes sounds great. I'm starting with that to find my preexisting
duplicates. Later, though, an md5sum wrapper script might be easier,
since I want to test just one specific file against everything I
already have.
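
A rough sketch of what such a wrapper might look like, in Python
rather than shell. The names md5_of and find_duplicates are made up
for illustration, and it assumes the existing files all live under one
directory tree. It compares sizes first so that most files never need
to be hashed:

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(candidate, library_dir):
    """Return sorted paths under library_dir whose contents match candidate."""
    candidate_real = os.path.realpath(candidate)
    candidate_size = os.path.getsize(candidate)
    candidate_md5 = md5_of(candidate)
    matches = []
    for root, _dirs, files in os.walk(library_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip the candidate itself if it already sits in the library.
            if os.path.realpath(path) == candidate_real:
                continue
            try:
                # Cheap filter: files of different size cannot be identical.
                if os.path.getsize(path) != candidate_size:
                    continue
                if md5_of(path) == candidate_md5:
                    matches.append(path)
            except OSError:
                # Unreadable file or broken symlink: ignore it.
                continue
    return sorted(matches)
```

Calling find_duplicates("new.pdf", "/path/to/pdfs") (both paths are
placeholders) would return the list of existing files with identical
contents, or an empty list if the download is new.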

Now, if only I knew of a "fuzzy fdupes"! The test would need to
recognize two consecutive scanner images of the same page as the
"same", even though their bytes (and so their MD5 sums) differ.
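
One common approach to this kind of fuzzy comparison is an "average
hash": shrink each image to a small grid, mark each cell as brighter
or darker than the mean, and count how many cells differ between two
images. The sketch below is only an illustration, not something fdupes
does. It assumes each page has already been decoded into a grayscale
grid (a list of rows of 0-255 values), which would need an imaging
library not shown here. The function names and the max_distance
threshold of 5 are arbitrary choices:

```python
def average_hash(pixels, grid=8):
    """Hash a grayscale image (list of rows of 0-255 ints) to grid*grid bits."""
    height = len(pixels)
    width = len(pixels[0])
    cells = []
    for gy in range(grid):
        y0 = gy * height // grid
        y1 = max((gy + 1) * height // grid, y0 + 1)
        for gx in range(grid):
            x0 = gx * width // grid
            x1 = max((gx + 1) * width // grid, x0 + 1)
            # Average brightness of this cell of the downscaled image.
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if brighter than the overall mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def looks_same(pixels_a, pixels_b, max_distance=5):
    """Treat two images as the 'same' page if their hashes are close."""
    return hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance
```

Two scans of the same page should differ in only a few bits, while
different pages usually differ in many more. Pages that are mostly
text on white may all hash alike, so the threshold would need tuning
on real scans.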

--
Rahul