[linux-audio-dev] mp3 mess help

Frank NEUMANN frank.neumann at st.com
Wed Oct 27 09:36:23 UTC 2004


Hi list,
Jens M Andreasen <jens.andreasen at chello.se> wrote:

[..]
> > I have a full partition with them in it but it is obvious most of them
> > are there multiple times with different non-descriptive names..ouch:(
> > 
> > 
> > Is there a way I can search the mp3s to find which are the same/different
> > using the actual mp3 binary data??
> > 
> 
> Short version:
> 
>    1) Sort them (by binary content.)
> 
>    2) Delete duplicates.
> 
>    3) ...
> 
>    4) Profit! :)
> 
> 
> I would probably only keep the path to the mp3 in the sorted structure,
> and then open (and close) them for comparison as needed. 
> 
> 'man qsort' is your friend. 

My suggestion would be like this (if we are really talking about byte-by-byte
identical files):

find <path_to_mp3_directory> -iname "*.mp3" -exec md5sum {} \; | sort >log.txt

This gives you a logfile, "log.txt", with one line per file (md5 checksum
first, then the file name), sorted by checksum and _including_ duplicates.
Whenever two consecutive lines start with the same md5 sum, those two files
have identical content.

There are probably geekier ways, using awk etc., to print out just the
names of the duplicates, but others will have to continue from here.
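
As a starting point, here is a rough sketch of how the awk step could look.
It is untested on large collections; the function name list_dupes is made up
for this example, and it assumes GNU md5sum output (32 hex digits, two
separator characters, then the path) and file names without newlines or
backslashes.

```sh
#!/bin/sh
# Hypothetical helper (not part of any existing tool): print groups of
# byte-identical MP3 files found below the directory given as $1.
# Groups are separated by an empty line.
list_dupes() {
    find "$1" -type f -iname '*.mp3' -exec md5sum {} + |
        sort |
        awk '
            {
                sum  = substr($0, 1, 32)   # the md5 checksum
                name = substr($0, 35)      # the path, after the 2 separator chars
                if (sum == prev_sum) {
                    if (!in_group) {
                        # first repeat of this checksum: start a new group
                        if (groups++) print ""
                        print prev_name
                        in_group = 1
                    }
                    print name
                } else {
                    in_group = 0
                }
                prev_sum = sum
                prev_name = name
            }'
}
```

Called as list_dupes <path_to_mp3_directory>, this only lists the
candidates and deletes nothing. If you do not want to rely on md5 alone,
"cmp file1 file2" confirms that two files really are byte-by-byte identical
before you remove one of them.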

Greetings,
Frank


