Question Overview
I’m working on a project that mimics the functionality of Shazam. I’ve managed to record audio and convert it into the base64 string the API requires; however, the API fails to identify any songs. The problem most likely lies in how the audio is recorded or encoded. I am currently sending my data directly to the API, so connection problems can be ruled out.
API Requirements
The API specifies that I need to provide a base64-encoded string of raw audio data that is smaller than 500KB. Samples of 3-5 seconds are sufficient for song recognition. The required specifications are:
- 44,100Hz sample rate
- Mono channel
- 16-bit signed PCM little endian
Media types like mp3, wav, etc., are not supported.
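To make the size limit concrete, the arithmetic implied by these requirements can be sketched as below. This is only an illustration: the constant and function names are made up for this example, and it assumes that "500KB" means 500 * 1024 bytes and applies to the raw audio rather than to the base64 text, which the requirements as quoted do not say.

```kotlin
// Format constants taken from the API requirements listed above.
const val SAMPLE_RATE_HZ = 44_100
const val CHANNELS = 1
const val BYTES_PER_SAMPLE = 2 // 16-bit PCM

// Assumption: "500KB" means 500 * 1024 bytes and applies to the raw audio.
const val MAX_RAW_BYTES = 500 * 1024

/** Size in bytes of raw PCM audio of the given length. */
fun rawPcmSizeBytes(seconds: Int): Int =
    seconds * SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE

/** Length of the base64 text for a payload of the given size (padded, no line breaks). */
fun base64Length(rawBytes: Int): Int = (rawBytes + 2) / 3 * 4

fun main() {
    for (seconds in 3..5) {
        val raw = rawPcmSizeBytes(seconds)
        val fits = raw <= MAX_RAW_BYTES
        println("$seconds s: $raw raw bytes, ${base64Length(raw)} base64 chars, fits=$fits")
    }
}
```

Under these assumptions, 5 seconds of audio is 441,000 raw bytes, which stays under the limit, while its base64 form is 588,000 characters. If the limit turned out to apply to the encoded string, a 5-second sample would exceed it.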
Audio Recorder Implementation
I’ve built an AudioRecorder class to capture audio:
    override fun initiate(outputFile: File) {
        audioRecorder().apply {
            setAudioSource(MediaRecorder.AudioSource.MIC)
            setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP)
            setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB)
            setAudioChannels(1)
            setAudioSamplingRate(44100)
            setAudioEncodingBitRate(16 * 44100)
            setOutputFile(outputFile.absolutePath)
            prepare()
            start()
            recorder = this
        }
    }

    override fun halt() {
        recorder?.stop()
        recorder?.reset()
        recorder = null
    }

    fun extractAudioData(file: File): ByteArray {
        return file.readBytes()
    }
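For comparison with the recorder above: `MediaRecorder` configured with `THREE_GPP` and `AMR_NB` writes compressed audio inside a container file, whereas the API asks for headerless PCM. Android's `AudioRecord` class delivers PCM samples directly. The following is a minimal, untested sketch of that alternative, not part of the project code. The function name `recordRawPcm` is hypothetical, the app is assumed to already hold the `RECORD_AUDIO` permission, and the call blocks, so it would need to run off the main thread. Whether this alone fixes the recognition failure is not something the sketch can confirm.

```kotlin
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder
import java.io.ByteArrayOutputStream

// Hypothetical helper: captures 44.1 kHz, mono, 16-bit PCM for roughly durationMs.
// Assumes RECORD_AUDIO has been granted and that this runs on a background thread.
fun recordRawPcm(durationMs: Long): ByteArray {
    val sampleRate = 44_100
    val channel = AudioFormat.CHANNEL_IN_MONO
    val encoding = AudioFormat.ENCODING_PCM_16BIT

    val minBuffer = AudioRecord.getMinBufferSize(sampleRate, channel, encoding)
    require(minBuffer > 0) { "Recording format not supported on this device" }

    val record = AudioRecord(
        MediaRecorder.AudioSource.MIC, sampleRate, channel, encoding, minBuffer * 2
    )
    check(record.state == AudioRecord.STATE_INITIALIZED) { "AudioRecord failed to initialize" }

    // 2 bytes per sample, 1 channel.
    val targetBytes = (sampleRate * 2L * durationMs / 1000).toInt()
    val output = ByteArrayOutputStream(targetBytes)
    val buffer = ByteArray(minBuffer)

    record.startRecording()
    try {
        while (output.size() < targetBytes) {
            val wanted = minOf(buffer.size, targetBytes - output.size())
            val read = record.read(buffer, 0, wanted)
            if (read <= 0) break // error code or no data: stop early
            output.write(buffer, 0, read)
        }
    } finally {
        record.stop()
        record.release()
    }
    // Bytes are in the device's native order, little endian on typical Android hardware.
    return output.toByteArray()
}
```

The returned array contains only sample data with no file header, so it could be passed to a base64 encoder as-is.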
I also have a function that triggers the recording at the appropriate moment:
    private suspend fun beginAudioRecording(audioRecorder: AudioRecorder, context: Context): String {
        Log.d("your.package.name", "Starting audio recording")

        // Begin recording
        File(context.cacheDir, "recorded_audio.3gp").also {
            audioRecorder.initiate(it)
            recordedFile = it
        }

        // Record for 5 seconds
        delay(5000)

        // End the recording
        audioRecorder.halt()

        // Convert recorded audio to Base64 string
        val recordedData = audioRecorder.extractAudioData(recordedFile!!)
        return transformToBase64(recordedData)
    }
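One way to narrow down whether the recording step is at fault is to look at the first bytes of the recorded file before encoding it. Files in the ISO base media family, which includes 3GP and MP4, carry the ASCII marker `ftyp` at byte offset 4, while raw PCM has no header at all. The helper names below are hypothetical and the byte arrays in `main` are synthetic stand-ins, not real recordings.

```kotlin
/**
 * Returns true if the data starts like an ISO base media file
 * (the family that includes 3GP and MP4): bytes 4..7 spell "ftyp".
 */
fun looksLikeIsoMediaContainer(data: ByteArray): Boolean {
    if (data.size < 8) return false
    return String(data, 4, 4, Charsets.US_ASCII) == "ftyp"
}

/** Hex dump of the first few bytes, convenient for logging. */
fun headerHex(data: ByteArray, count: Int = 16): String =
    data.take(count).joinToString(" ") { "%02x".format(it.toInt() and 0xff) }

fun main() {
    // Synthetic header resembling the start of a 3GP file.
    val containerLike = byteArrayOf(0, 0, 0, 0x18) + "ftyp3gp4".toByteArray(Charsets.US_ASCII)
    // Raw PCM starts directly with sample values; silence is all zeros.
    val pcmLike = ByteArray(16)

    println(looksLikeIsoMediaContainer(containerLike)) // true
    println(looksLikeIsoMediaContainer(pcmLike))       // false
    println(headerHex(containerLike, 8))               // 00 00 00 18 66 74 79 70
}
```

Calling `looksLikeIsoMediaContainer(recordedData)` just before the base64 step would show whether the payload is a container file rather than raw samples.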
Base64 Conversion Function
Finally, here is my function for converting to base64:
    private fun transformToBase64(audioData: ByteArray): String {
        return Base64.encodeToString(audioData, Base64.DEFAULT)
    }
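A detail of the conversion above that may or may not matter: `android.util.Base64.DEFAULT` inserts line breaks into its output, and `Base64.NO_WRAP` suppresses them. The API requirements quoted earlier do not say whether line breaks are accepted, so this is only a possible factor. The sketch below shows the difference using `java.util.Base64` so that it runs on a plain JVM (on Android that class requires API level 26 or higher); the function names are hypothetical.

```kotlin
import java.util.Base64

/** Encodes to a single line of base64 with no line breaks. */
fun encodeSingleLine(audioData: ByteArray): String =
    Base64.getEncoder().encodeToString(audioData)

/** Encodes MIME style, which wraps the output at 76 characters per line. */
fun encodeWrapped(audioData: ByteArray): String =
    Base64.getMimeEncoder().encodeToString(audioData)

fun main() {
    // 100 arbitrary bytes, enough to exceed one 76-character line.
    val sample = ByteArray(100) { it.toByte() }

    println(encodeSingleLine(sample).contains('\n')) // false
    println(encodeWrapped(sample).contains('\n'))    // true
}
```

With `android.util.Base64`, the single-line equivalent would be `Base64.encodeToString(audioData, Base64.NO_WRAP)`.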