I ran into a similar issue when I had a bunch of duplicate image files. In my case, I just used md5sum on the files and sorted the results:
find "$rootdir" -type f -name "*.jpg" -exec md5sum {} + | sort
Files with identical contents produce the same hash, so after sorting the duplicates end up on adjacent lines and are easy to spot. I deleted the dupes manually from there. I could have extended the script to delete all but the first occurrence, but I'm always paranoid about doing that in an ad-hoc script.
Note that this only works for duplicate files with identical contents.
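As a rough sketch of how the hash-and-sort idea can be narrowed down to just the duplicates: the helper names `list_dupes` and `extra_copies` below are made up for illustration, and the `uniq` options used are GNU coreutils extensions, so this assumes a typical Linux system. Nothing is deleted; the second function only prints the files that would be redundant.

```shell
#!/usr/bin/env bash
# Print groups of files under a directory whose MD5 hashes match.
# Hypothetical helper; assumes GNU find, md5sum, sort and uniq.
list_dupes() {
    local rootdir=$1 pattern=${2:-*}
    find "$rootdir" -type f -name "$pattern" -exec md5sum {} + |
        sort |
        uniq -w32 --all-repeated=separate   # compare only the 32-char hash
}

# Dry run: print every file in a duplicate group except the first one.
# md5sum lines are "<32 hex chars><2 separator chars><path>", so the
# path starts at column 35.
extra_copies() {
    list_dupes "$@" | awk '{
        hash = substr($0, 1, 32)
        if (hash != "" && hash == prev) print substr($0, 35)
        prev = hash
    }'
}
```

For example, `extra_copies "$rootdir" "*.jpg"` lists the candidates for removal, which can then be reviewed by hand before anything is actually deleted.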
It might be a dozen years late, but I just wrote a command-line program that tries to detect similar audio files by comparing acoustic fingerprints: https://github.com/derat/soundalike
It uses the fpcalc utility from Chromaprint to generate the fingerprints, and then builds a lookup table to find possible matches before comparing fingerprints more rigorously.
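To give a feel for the raw material such a tool works with, here is a small sketch that collects one fingerprint per file. It is not how soundalike itself is implemented; the function names are hypothetical, and it assumes `fpcalc` is on the PATH and prints its usual `KEY=VALUE` lines including a `FINGERPRINT=` line.

```shell
#!/usr/bin/env bash
# Pull the fingerprint value out of fpcalc's KEY=VALUE output (read on stdin).
fingerprint_from_output() {
    sed -n 's/^FINGERPRINT=//p'
}

# Print "fingerprint<TAB>path" for each file given on the command line.
# Hypothetical helper; requires fpcalc (from Chromaprint) to be installed.
print_fingerprints() {
    local file
    for file in "$@"; do
        printf '%s\t%s\n' "$(fpcalc "$file" | fingerprint_from_output)" "$file"
    done
}
```

Piping `print_fingerprints *.mp3` through `sort` only puts files with identical fingerprints next to each other. Detecting recordings that are merely similar needs the kind of fuzzy fingerprint comparison the program above performs, which this sketch does not attempt.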
dupeGuru Music Edition is what you want. Set the scan type to "Audio Contents" in Preferences. Note that the program is fairware, so please contribute if you can.
I suggest you couple this with MusicBrainz Picard, which can tag your music files automatically.
There is a Rhythmbox plugin that was made some time ago for this. I've used it recently, but it still leaves a little to be desired. There is a "PPA" for it, but no built packages yet, just the Bazaar branch. The install instructions go something like this:
If you're interested in using the Bazaar'd source code, do the following instead:
Once it's installed, restart Rhythmbox and you should now see a Duplicates Finder in the plugin list.
After activating it, additional configuration options become available.
Once the plugin is enabled and has found duplicates, it adds an additional entry to your library list:
A few things I've found "odd": I've tried this on a media library with over 120,000 songs (over 1,000 duplicates) and on a library with about 1,000 songs and maybe 30 duplicates. On the former it took a VERY long time and crashed Rhythmbox several times during the search. I eventually set it to automatically "Remove from Library" to avoid having to rebuild the list. On smaller libraries everything works great, though.
When a duplicate is found and you have the default options selected, the lower-quality version of the song is added to the list. So it's safe to select all songs on the Duplicates list and "Remove" them (either delete from disk or remove from library).
You can use fdupes for that; it gives you a list of all duplicate files, and it is easy to install.
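As a sketch, assuming a Debian/Ubuntu system where the package is called `fdupes`, installing and running it might look like the following. The wrapper name `find_dupes` is hypothetical; `-r` is fdupes' option for recursing into subdirectories.

```shell
#!/usr/bin/env bash
# Install on Debian/Ubuntu (assumption: the package is named "fdupes"):
#   sudo apt-get install fdupes

# Recursively list sets of duplicate files under a directory
# (defaults to the current directory).
find_dupes() {
    local dir=${1:-.}
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    fdupes -r "$dir"
}
```

Calling `find_dupes ~/Music` prints each set of identical files as a group of paths, with a blank line between sets. Like the hashing approach, this only finds files whose contents are identical.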
Try FSlint or dupeGuru.
To install FSlint, type in a terminal (Ctrl-Alt-T):
Hope this is useful.
I've used FSlint to find duplicate files in general. FSlint is "a utility to find and clean various forms of lint on a filesystem."