An algorithm and accompanying code should be developed that scans audio files and marks the first appearance of vocals in each.
## Deliverables
The coder should develop an algorithm that scans audio files and finds vocals in them. The most important piece of information is the start of the first vocal appearance. Certain audio playback systems use this information, and entering it manually is very tedious and error-prone.
I will supply the coder who takes the task with 10,000 manually edited songs, each with its first vocal appearance marked. I will use a further 5,000 songs, also marked, to test the algorithm. Songs will be selected at random from the larger database, so each set will contain a mix of song types.
The algorithm must produce one of 3 possible outputs for each scanned song:
- no vocals were found
- vocals were found and they start at XXX
- the algorithm cannot determine whether vocals are present
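To make the three outputs concrete, here is a minimal sketch of how a scan result could be represented. The names `Outcome`, `ScanResult`, and `vocal_start_ms` are hypothetical and not part of the brief, and milliseconds are assumed as the unit for the vocal start.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    """The three possible outputs for a scanned song (names are illustrative)."""
    NO_VOCALS = "no vocals were found"
    VOCALS_FOUND = "vocals were found"
    UNDETERMINED = "cannot determine whether vocals are present"


@dataclass(frozen=True)
class ScanResult:
    outcome: Outcome
    # Assumed unit: milliseconds from the start of the file.
    # Only meaningful when outcome is VOCALS_FOUND.
    vocal_start_ms: Optional[int] = None

    def __post_init__(self) -> None:
        has_start = self.vocal_start_ms is not None
        # A start offset must be present exactly when vocals were found.
        if (self.outcome is Outcome.VOCALS_FOUND) != has_start:
            raise ValueError(
                "vocal_start_ms must be set if and only if vocals were found"
            )


# Example usage with made-up values.
found = ScanResult(Outcome.VOCALS_FOUND, vocal_start_ms=12345)
instrumental = ScanResult(Outcome.NO_VOCALS)
unsure = ScanResult(Outcome.UNDETERMINED)
```

Keeping "cannot determine" as its own outcome, rather than folding it into "no vocals", lets the share of undecided songs be measured separately.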
For me to be satisfied with the algorithm, at least 90% of the songs must fall into the 1st and 2nd categories, and at most 10% may fall into the 3rd category.
For songs in the 1st and 2nd categories, the algorithm must additionally achieve a 90% success rate.
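The acceptance criteria above could be checked roughly as in the following sketch. The brief does not say how close a reported start must be to the marked start to count as a success, so `tolerance_ms` (500 ms by default) is purely an assumption, as are the function name and the data representation: `None` for "no vocals", an integer millisecond offset for a vocal start, and the `UNDETERMINED` sentinel for the third output.

```python
from typing import Optional, Sequence, Tuple, Union

UNDETERMINED = "undetermined"  # hypothetical sentinel for the 3rd output

Prediction = Union[int, None, str]  # ms offset, None (no vocals), or UNDETERMINED


def evaluate(
    predictions: Sequence[Prediction],
    truths: Sequence[Optional[int]],
    tolerance_ms: int = 500,  # assumed; the brief gives no tolerance
) -> Tuple[float, float, bool]:
    """Return (decided_rate, success_rate, passed) for a test set."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("predictions and truths must be non-empty and equal length")

    decided = 0
    correct = 0
    for pred, truth in zip(predictions, truths):
        if pred == UNDETERMINED:
            continue  # 3rd category: not counted as decided
        decided += 1
        if pred is None or truth is None:
            # Correct only if both agree that there are no vocals.
            correct += int(pred is None and truth is None)
        else:
            correct += int(abs(pred - truth) <= tolerance_ms)

    decided_rate = decided / len(truths)
    success_rate = correct / decided if decided else 0.0
    passed = decided_rate >= 0.90 and success_rate >= 0.90
    return decided_rate, success_rate, passed


# Example usage with made-up values: 9 decided songs and 1 undetermined.
example = evaluate([1000] * 9 + [UNDETERMINED], [1200] * 10)
```

Note that the success rate is computed only over songs placed in the 1st and 2nd categories, which is one reading of the second criterion.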
The database contains mp3, mp2, and wav files; I can supply code that reads them as raw PCM audio data.
The ideal coder for this task has some background in DSP and at least one completed project in this area. A high-bandwidth, flat-rate internet connection is also advisable for transferring audio files.
* * *
This broadcast message was sent to all bidders on Thursday Jan 29, 2009 7:45:07 PM:
Dear bidder, I have prepared a small list of mp3s whose to-vocal information is embedded in the form of an external XML file. To obtain the correct millisecond offset from the start of the file at which the vocal starts, add values in and . The files are available as a torrent download, and the tracker file can be found here:
[login to view URL]
Thank you, and please allow a few more days for me to choose the right bidder for the job. I am also preparing a sandbox with 10,000 pre-tagged files where you can test your algorithm.
## Platform
Windows XP