Issue with Google Speech-to-Text API for audio files exceeding one minute

I'm trying to transcribe an audio file with the Google Speech-to-Text API. Here are the file details:

  1. Format: Raw Audio
  2. Sample Rate: 16000 Hz
  3. Bit Depth: 16
  4. Channel: Mono

I implemented the following Python code to generate the text:

import json

# `api` is assumed to be an already-authenticated Speech API client object.
request = api.speech().async_recognize(
    data={
        'configuration': {
            'format': 'LINEAR16',  # raw 16-bit signed little-endian samples
            'sample_rate_hertz': 16000,  # 16 kHz
            'language': 'en-US',  # BCP-47 language tag
        },
        'recording': {
            'uri': 'gs://somebucket/audio.raw',  # audio object in Cloud Storage
        },
    })
result = request.execute()
print(json.dumps(result))

The code seems to function correctly, but it appears that the transcription only processes one minute of audio, disregarding any additional content. Can anyone explain this issue?

I ran into a similar problem where longer audio files came back cut short. The Speech-to-Text API enforces different duration limits for synchronous and asynchronous requests: synchronous recognition accepts only about one minute of audio, while asynchronous (long-running) recognition is meant for longer files read from Cloud Storage and should not trim anything after a minute. So first confirm that your request really goes through the asynchronous method rather than the synchronous one. Then check that the object in your bucket is complete, since a permissions problem or an interrupted upload could leave you with a truncated file.
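
Two quick checks can help narrow down where the audio gets lost. The sketch below is only an illustration, not an official client: both function names are made up for this post, and it assumes the completed response is a plain dict with the documented `results` / `alternatives` / `transcript` layout. The first helper works out how long a headerless PCM file should be from its size in bytes, so you can compare that with the object in the bucket. The second joins every entry in `results`, because a long recording may come back as several sequential results and reading only the first one can look like truncation.

```python
def expected_duration_seconds(num_bytes, sample_rate_hz=16000,
                              bytes_per_sample=2, channels=1):
    """Duration of headerless PCM audio, derived from its size in bytes.

    Defaults match the file in the question: 16 kHz, 16-bit, mono.
    """
    return num_bytes / (sample_rate_hz * bytes_per_sample * channels)


def join_transcripts(response):
    """Concatenate the top alternative of every entry in response['results'].

    `response` is assumed to be a dict shaped like the documented
    recognition response; missing keys are treated as empty.
    """
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            parts.append(alternatives[0].get("transcript", "").strip())
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    # 1,920,000 bytes at 32,000 bytes per second is one minute of audio.
    print(expected_duration_seconds(1_920_000))  # 60.0

    # Hypothetical response with two sequential results.
    sample = {
        "results": [
            {"alternatives": [{"transcript": "first minute of speech"}]},
            {"alternatives": [{"transcript": " and the rest of it"}]},
        ]
    }
    print(join_transcripts(sample))
```

If the size of the uploaded file matches the duration you expect and the joined transcript is still short, the problem is more likely in the request itself than in the upload.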