Large number of mp3 files - checking for copies?

Started by December 24, 2009 07:19 PM
4 comments, last by Servant of the Lord 14 years, 10 months ago
I recently got 120 GB of (legal) (non-music =(...) mp3 files from multiple sources. I'm 100% confident a lot of that 120 GB (probably about a third) is redundant data. I'm also 100% confident I don't want to go through it manually looking for copies/clones. Any suggestions for how to tackle this? I'm hoping for some program that can, given a folder, look through each subfolder and compare files, ignoring the filenames and comparing, say, the first 5 seconds of each file.
If they are exactly the same data, you could just write a script to list the files sorted by MD5 hashes (or whatever) as this would make it obvious which ones were redundant. Obviously you could make a slightly more complicated script too that wouldn't require you to even look at the list.

If they are not completely identical (same sound, different bitrate, or whatever), then you have a challenge.
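
For the exact-copy case, here is a rough sketch of the hash idea in Python. It walks a folder, hashes every .mp3, and reports only the hashes shared by more than one file, so there is no list to eyeball. The function names (`file_md5`, `find_duplicates_by_hash`) and the 1 MB chunk size are my own choices for illustration, not anything from a particular tool, and it only catches byte-for-byte identical files.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates_by_hash(root, extension=".mp3"):
    """Map each MD5 digest to the files under root that share it.

    Filenames are ignored; only the file contents are compared.
    """
    by_hash = defaultdict(list)
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extension):
                path = os.path.join(folder, name)
                by_hash[file_md5(path)].append(path)
    # Keep only digests that belong to more than one file.
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates_by_hash` on the root folder returns a dictionary where each value is a group of identical files, so you would keep one path per group and delete the rest. Hashing 120 GB means reading all 120 GB once, so expect it to take a while.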
-~-The Cow of Darkness-~-
http://www.lmgtfy.com/?q=mp3+dupe+finder
If you're on Windows, find the root folder of the music, right-click it, and select "Search". Search for all files named *.mp3, including subfolders. When the search finishes, sort the results by size. For any two or more files of identical size, listen to both; they are probably the same, so delete one of them.
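
If you'd rather not use the search window, the same size trick can be sketched in a few lines of Python. This is only an illustration: `group_by_size` is a made-up name, and equal size only makes two files *probable* copies, so you would still listen to them (or compare their contents) before deleting anything.

```python
import os
from collections import defaultdict


def group_by_size(root, extension=".mp3"):
    """Map each file size in bytes to the files under root with that size."""
    by_size = defaultdict(list)
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extension):
                path = os.path.join(folder, name)
                by_size[os.path.getsize(path)].append(path)
    # Only sizes shared by two or more files are candidate duplicates.
    return {size: sorted(paths) for size, paths in by_size.items() if len(paths) > 1}
```

Checking sizes is much faster than reading every file, since it only touches file metadata.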
Quote: Original post by Dunge
http://www.lmgtfy.com/?q=mp3+dupe+finder


I have some advice for you. This sort of thing is obnoxious, particularly when the results are not actually useful (notice that at least the first five are either not freeware or not safe). If you're going to waste someone's time, you should probably at least be nice about it.
-~-The Cow of Darkness-~-
@AndreTheGiant:
Heheh, that's a pretty clever idea. Thank you for that, it works great. [smile]

This topic is closed to new replies.