I’m building a music recognition app, similar to Shazam, for a class assignment. I can record audio and convert it to Base64 as the API documentation requires, but song detection always fails. I suspect the problem is in how I’m recording or encoding the audio data.
The RapidAPI documentation says I need:
A Base64-encoded byte array of raw audio data under 500 KB (3-5 second clips work best). The audio must be 44100 Hz sample rate, mono, signed 16-bit PCM, little endian. No other formats such as MP3 or WAV are accepted.
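By my math that works out to 44100 samples per second times 2 bytes per sample on one channel, so a 5 second clip of raw PCM should come in just under the limit (I’m assuming the 500 KB cap applies to the raw bytes, before Base64 grows them by roughly a third):

// Back-of-envelope size check for the required format
val bytesPerSecond = 44_100 * 2 * 1      // 16-bit mono at 44.1 kHz = 88,200 bytes/s
val fiveSecondClip = bytesPerSecond * 5  // 441,000 bytes, under the 500 KB cap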
Here’s my sound recording class:
override fun startCapture(targetFile: File) {
    buildRecorder().apply {
        // MediaRecorder setup: records into a 3GP container with the AMR-NB codec
        setAudioSource(MediaRecorder.AudioSource.DEFAULT)
        setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP)
        setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB)
        setAudioChannels(1)
        setAudioSamplingRate(44100)
        setAudioEncodingBitRate(16 * 44100)
        setOutputFile(targetFile.absolutePath)
        prepare()
        start()
        mediaRecorder = this
    }
}

override fun stopCapture() {
    mediaRecorder?.stop()
    mediaRecorder?.reset()
    mediaRecorder = null
}

// Reads the finished recording back as bytes for encoding
fun extractAudioBytes(audioFile: File): ByteArray {
    return audioFile.readBytes()
}
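Side note: the docs say “raw audio data … signed 16-bit PCM little endian”, so I suspect they mean samples taken straight from AudioRecord rather than whatever MediaRecorder writes to the file. This is only my untested sketch of what that capture might look like (captureRawPcm is just a name I made up, and I’m assuming the byte[] overload of read() gives little-endian samples here):

// Untested sketch: capture raw PCM directly instead of an encoded file.
// Uses android.media.AudioRecord / AudioFormat, android.os.SystemClock, java.io.ByteArrayOutputStream.
fun captureRawPcm(durationMs: Long = 3000L): ByteArray {
    val sampleRate = 44100
    val minBuf = AudioRecord.getMinBufferSize(
        sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT
    )
    val recorder = AudioRecord(
        MediaRecorder.AudioSource.MIC,
        sampleRate,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,  // signed 16-bit samples
        minBuf
    )
    val out = ByteArrayOutputStream()
    val buffer = ByteArray(minBuf)
    recorder.startRecording()
    val end = SystemClock.elapsedRealtime() + durationMs
    while (SystemClock.elapsedRealtime() < end) {
        val read = recorder.read(buffer, 0, buffer.size)
        if (read > 0) out.write(buffer, 0, read)
    }
    recorder.stop()
    recorder.release()
    return out.toByteArray()
}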
My recording function:
private suspend fun captureAudioSample(recorder: SoundRecorder, ctx: Context): String {
    File(ctx.cacheDir, "sample.3gp").also {
        recorder.startCapture(it)
        recordedFile = it
    }
    delay(5000)  // record for roughly 5 seconds
    recorder.stopCapture()

    val rawData = recorder.extractAudioBytes(recordedFile!!)
    val encodedData = encodeToBase64(rawData)
    return encodedData
}
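To rule out the 500 KB limit, I was also going to log the sizes just before returning from captureAudioSample, something like this (diagnostic only, the tag is a placeholder):

// Diagnostic only: compare payload sizes against the documented limit
Log.d("AudioCapture", "raw=${rawData.size} bytes, base64=${encodedData.length} chars")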
Base64 conversion:
private fun encodeToBase64(rawAudio: ByteArray): String {
    return Base64.encodeToString(rawAudio, Base64.DEFAULT)
}
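One detail I’m unsure about: android.util.Base64.DEFAULT inserts line breaks into the encoded output. If the API expects one unbroken string, I assume the variant below is what’s needed, but I don’t know whether that’s related to my problem:

// Same encoding, but without the newline separators Base64.DEFAULT adds
private fun encodeToBase64(rawAudio: ByteArray): String {
    return Base64.encodeToString(rawAudio, Base64.NO_WRAP)
}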
What am I doing wrong with the audio format or encoding process?