I'm building a music recognition app for my class assignment that works similarly to Shazam. I can record audio and convert it to base64 as the API expects, but song detection always fails. I suspect there's an issue with how I'm recording or encoding the audio data.
The RapidAPI documentation says it needs base64-encoded raw audio data under 500KB (3-5 seconds is enough). The audio must be 44100 Hz, mono, signed 16-bit PCM, little endian. Other formats such as MP3 or WAV are not accepted.
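For reference, this is how I read those requirements in terms of Android's AudioRecord/AudioFormat constants. I haven't wired this into the app yet, and buildPcmRecorder is just a placeholder name, so treat it as a sketch of the target format rather than what my code currently does:

import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder

// Sketch only: the API's format spec expressed as AudioRecord parameters.
private const val SAMPLE_RATE = 44100                       // 44100 Hz
private const val CHANNEL = AudioFormat.CHANNEL_IN_MONO     // mono
private const val ENCODING = AudioFormat.ENCODING_PCM_16BIT // signed 16-bit PCM (little endian on Android)

private fun buildPcmRecorder(): AudioRecord {
    // RECORD_AUDIO permission is requested elsewhere in the app.
    val minBuffer = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL, ENCODING)
    return AudioRecord(
        MediaRecorder.AudioSource.MIC,
        SAMPLE_RATE,
        CHANNEL,
        ENCODING,
        minBuffer * 2
    )
}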
Here’s my sound recording class:
override fun startRecording(targetFile: File) {
    buildRecorder().apply {
        setAudioSource(MediaRecorder.AudioSource.DEFAULT)
        setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP)  // 3GP container
        setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB)     // AMR-NB codec
        setAudioChannels(1)
        setAudioSamplingRate(44100)
        setAudioEncodingBitRate(16 * 44100)
        setOutputFile(targetFile.absolutePath)
        prepare()
        start()
        mediaRecorder = this
    }
}
override fun stopRecording() {
    mediaRecorder?.stop()
    mediaRecorder?.reset()
    mediaRecorder = null
}
fun getAudioBytes(targetFile: File): ByteArray {
    return targetFile.readBytes()
}
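To rule out the size limit, I use a quick debug check like this (logRecordingSize is just a helper I sketched for logging, not part of the recorder class above) to see whether the file stays under 500KB:

import android.util.Log
import java.io.File

// Debug-only sketch: compare the recorded file's size with the API's 500KB limit.
private fun logRecordingSize(targetFile: File) {
    val sizeKb = targetFile.length() / 1024
    Log.d("SongDetectionActivity", "Recorded file is ${sizeKb}KB (API limit is 500KB)")
}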
This is my recording handler function:
private suspend fun handleAudioCapture(recorder: AudioRecorder, appContext: Context): String {
    Log.d("SongDetectionActivity", "Starting audio capture process")
    File(appContext.cacheDir, "recorded_audio.3gp").also {
        recorder.startRecording(it)
        capturedFile = it
    }
    delay(5000) // record for 5 seconds
    recorder.stopRecording()
    val rawAudioBytes = recorder.getAudioBytes(capturedFile!!)
    return encodeToBase64(rawAudioBytes)
}
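For completeness, this is roughly how the handler gets called from the activity (simplified; audioRecorder and detectSong are just placeholder names for my recorder instance and the API request):

import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.launch

// Simplified call site: run the suspend handler in the activity's lifecycle scope,
// then pass the base64 payload on to the API request.
lifecycleScope.launch {
    val base64Audio = handleAudioCapture(audioRecorder, applicationContext)
    detectSong(base64Audio)
}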
And my base64 encoding function:
private fun encodeToBase64(rawBytes: ByteArray): String {
    return Base64.encodeToString(rawBytes, Base64.DEFAULT)
}
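One thing I'm not sure about: Base64.DEFAULT inserts line breaks into the encoded output. A no-wrap variant would look like this, but I don't know whether the API cares:

import android.util.Base64

// Same encoding, but without the line breaks that Base64.DEFAULT adds.
private fun encodeToBase64NoWrap(rawBytes: ByteArray): String {
    return Base64.encodeToString(rawBytes, Base64.NO_WRAP)
}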
The API connection works fine when I test manually. What am I doing wrong with the audio recording or conversion process?