I need to create a cooperative music identification service. Every user will have the option to fingerprint a song and send it to the server along with its metadata. At the beginning the service database will be empty, and every time a fingerprint is received the metadata for that song will be updated (the server will assign metadata to a fingerprint by majority vote when different users send different information for the same fingerprint).
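The majority-vote assignment described above could be sketched roughly like this (class and method names are illustrative, not from any existing library):

```python
from collections import Counter

class FingerprintStore:
    """Maps each fingerprint to user-submitted metadata, resolved by majority vote."""

    def __init__(self):
        # fingerprint -> Counter of metadata tuples submitted by users
        self._votes = {}

    def submit(self, fingerprint, metadata):
        """Record one user's metadata submission for a fingerprint."""
        self._votes.setdefault(fingerprint, Counter())[metadata] += 1

    def metadata_for(self, fingerprint):
        """Return the metadata most users agreed on, or None if unknown."""
        votes = self._votes.get(fingerprint)
        if not votes:
            return None
        return votes.most_common(1)[0][0]
```

For example, if two users submit `("Artist A", "Song X")` for the same fingerprint and one user submits a misspelled variant, `metadata_for` returns the majority choice.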
I need to calculate a fingerprint for the whole song; I do not need to identify a song from just a fragment.
The fingerprint does not need to be 100% accurate. I will be happy if two song files receive the same fingerprint when they are the same recording encoded at different compression rates. A small amount of robustness to noise would be a plus.
Silence at the beginning or the end of the song will not be a problem; I can remove it with a standard silence-suppression algorithm (and here too I do not need a very precise result).
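A simple energy-threshold trim is usually enough for this step. The sketch below works on a plain list of PCM sample values; the `threshold` and `window` values are illustrative and would need tuning for real audio:

```python
def trim_silence(samples, threshold=500, window=1024):
    """Strip leading and trailing silence from a list of PCM samples.

    A window counts as silent when its mean absolute amplitude falls
    below `threshold` (both parameters are illustrative defaults).
    """
    def is_silent(chunk):
        return (sum(abs(s) for s in chunk) / max(len(chunk), 1)) < threshold

    windows = [samples[i:i + window] for i in range(0, len(samples), window)]
    start = 0
    while start < len(windows) and is_silent(windows[start]):
        start += 1
    end = len(windows)
    while end > start and is_silent(windows[end - 1]):
        end -= 1
    # Flatten the remaining windows back into one sample list
    return [s for w in windows[start:end] for s in w]
```

Since the fingerprint itself does not have to be exact, trimming at window granularity like this is acceptable: being off by a fraction of a second at either end changes the fingerprint input only slightly.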
I know there are open-source libraries such as http://echoprint.me/ and https://acoustid.org/, but they are excessive for my needs: if I understood correctly, they can identify a song from just a part of it, and that requires a heavy database. I need an algorithm that gives me a reasonably small (a few kB) fingerprint for the whole song.
What is the simplest and fastest algorithm I can use?
Thanks to all
I suggest you use the AcoustID project. Your description matches this project on a lot of points; only some of their approaches differ from what you suggest.
Have a look at their mailing list for some better explanations: https://groups.google.com/forum/#!forum/acoustid
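AcoustID's client-side fingerprinter (Chromaprint) produces a fingerprint that decodes to a sequence of 32-bit integers, and such fingerprints are commonly compared by bitwise Hamming similarity. This is a hedged sketch of that comparison, not Chromaprint's own matching code; the 0.9 threshold mentioned below is an assumption, not a value from the project:

```python
def hamming_similarity(fp_a, fp_b):
    """Fraction of matching bits between two fingerprints, each given as a
    list of 32-bit integers (e.g. a decoded Chromaprint fingerprint).

    Only the overlapping prefix is compared; 1.0 means identical.
    """
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    # XOR exposes differing bits; count them across all compared words
    differing = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return 1.0 - differing / (32 * n)
```

In practice you would pick a similarity cutoff (say, above 0.9) to decide that two differently compressed encodings of the same song should share one database entry, which matches the tolerance you describe.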