Hey everyone! I’m trying to set up a cool automation with Zapier but I’m stuck. My goal is to create audio files from text input in Slack. Here’s what I want to do:
1. Type text in Slack
2. Send it to Google’s Text-to-Speech API
3. Get the audio as base64 data
4. Create an MP3 file or HTML with an audio tag
5. Store it on Google Drive
6. Post the link back to Slack
I’ve tried a bunch of things:
- Using Google Drive actions (can’t change MIME type easily)
- Zapier’s code module (limited by import restrictions)
- Looking for clever workarounds with other Zapier apps
Nothing seems to work smoothly. I feel like I’m missing something obvious. Any ideas on how to make this happen? I’d really appreciate some help or creative solutions!
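For reference, the Text-to-Speech and HTML steps in the list above can be sketched in plain Python. This assumes the Cloud Text-to-Speech REST endpoint `text:synthesize`, which returns the audio as a base64 string in an `audioContent` field. The default voice settings and the helper names `build_tts_request` and `build_audio_html` are placeholders for illustration, not part of any existing Zap.

```python
import html
import json

# Cloud Text-to-Speech REST endpoint; the API key or OAuth token is supplied separately.
TTS_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_tts_request(text, language_code="en-US"):
    """Return the JSON body for a text:synthesize call that asks for MP3 output."""
    return json.dumps({
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": "MP3"},
    })


def build_audio_html(tts_response_json, title="Slack message"):
    """Wrap the base64 audio from a synthesize response in a small HTML page."""
    audio_b64 = json.loads(tts_response_json)["audioContent"]
    # A data: URI embeds the MP3 in the page, so no separate audio file is needed.
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f'<body><audio controls src="data:audio/mpeg;base64,{audio_b64}"></audio></body></html>\n'
    )
```

Since the resulting page is plain text, it may be easier to pass through Zapier’s file actions than raw MP3 bytes, although long clips will make the page correspondingly large.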
I’ve actually tackled a similar automation challenge recently, and I can share what worked for me. Instead of using Google’s Text-to-Speech API directly, I found success with the ElevenLabs API integration in Zapier. It’s surprisingly flexible and produces high-quality audio files.
For storing and sharing, I bypassed Google Drive entirely and opted for Amazon S3. It’s dead simple to set up in Zapier and gives you much more control over file types and access permissions. Plus, you can generate temporary (pre-signed) URLs that expire, which is great for security.
The trickiest part was handling the base64 data, but I wrote a small Python script in Zapier’s code module to handle the conversion. It’s not perfect, but it gets the job done.
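Here is a minimal sketch of what such a conversion step might look like; it is not the script described above. It uses only the standard library, which fits the import limits of the code module. The function name `decode_mp3` and the `audio_b64` input field are assumptions. In a Zapier Python step the value would come from `input_data` and the result would go into `output`.

```python
import base64
import binascii


def decode_mp3(audio_b64):
    """Decode base64 text to bytes and check that it plausibly is MP3 data."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(audio_b64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("input is not valid base64") from exc

    # MP3 data starts either with an ID3 tag or with an MPEG frame sync
    # (eleven set bits: 0xFF followed by a byte whose top three bits are set).
    has_id3 = data[:3] == b"ID3"
    has_sync = len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
    if not (has_id3 or has_sync):
        raise ValueError("decoded data does not look like an MP3 stream")
    return data


# Inside a Zapier Python step this might be used as (assumed field name):
#   audio = decode_mp3(input_data["audio_b64"])
#   output = {"size_bytes": len(audio)}
```

Failing early on bad input keeps a broken upstream response from turning into an unplayable file further down the Zap.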
One last tip: consider using Slack’s built-in file sharing instead of posting links. It creates a smoother user experience overall. Hope this helps point you in the right direction!
Have you considered using Airtable as an intermediary step? I’ve found it to be incredibly versatile for similar workflows. You could set up a table to receive the Slack input, then use Zapier to trigger the text-to-speech conversion and store the resulting audio file’s metadata in Airtable.
From there, you can use Airtable’s built-in automation features or another Zap to handle the file storage in Google Drive. This approach gives you more flexibility and control over the process, especially when dealing with file types and metadata.
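As for the audio playback in Slack, you might want to look into using Slack’s Block Kit to create a custom message format around the audio link. As far as I know, Block Kit has no dedicated audio player block, so inline playback only comes from uploading the file to Slack itself. It’s a bit more work upfront, but it results in a much cleaner user experience.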
Just remember to be mindful of API rate limits and storage quotas when setting this up at scale.
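To make the Block Kit suggestion concrete, here is a rough sketch of a message payload that links to the generated audio. It assumes the message is sent with Slack’s `chat.postMessage` method; the function name `build_audio_message`, the wording of the blocks, and the 80-character preview limit are illustrative choices.

```python
def escape_mrkdwn(text):
    """Escape the three characters Slack treats as control characters in mrkdwn."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_audio_message(channel, audio_url, source_text):
    """Build a chat.postMessage payload that links to the generated audio file."""
    # Collapse whitespace so the preview stays on one quoted line, then truncate.
    preview = " ".join(source_text.split())
    if len(preview) > 80:
        preview = preview[:77] + "..."
    return {
        "channel": channel,
        # Plain-text fallback shown in notifications.
        "text": f"Audio ready: {audio_url}",
        "blocks": [
            {
                "type": "section",
                "text": {
                    "type": "mrkdwn",
                    "text": f"*Text-to-speech result*\n>{escape_mrkdwn(preview)}",
                },
            },
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": f"<{audio_url}|Listen to the audio>"},
            },
        ],
    }
```

The returned dictionary can be serialized to JSON and used as the request body, for example from a Webhooks step in Zapier.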
Hey, have you tried using webhooks? They can be super useful for this kind of thing. You could set up a webhook that receives the Slack message, then use a service like AWS Lambda to handle the text-to-speech and file creation. Then just use Zapier to post the link back to Slack. It’s a bit more complex but way more flexible!
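A rough sketch of the webhook idea, written as a Lambda-style handler. It assumes an API Gateway proxy event whose JSON body carries a `text` field; that field name, the `make_handler` factory, and the `synthesize` and `store` callables are all hypothetical stand-ins for whichever text-to-speech and storage services end up being used.

```python
import base64
import json


def make_handler(synthesize, store):
    """Build a Lambda-style handler from two caller-supplied functions.

    synthesize(text) -> MP3 bytes; store(filename, data) -> shareable URL.
    Both are placeholders for the real TTS and storage calls.
    """
    def handler(event, context=None):
        raw = event.get("body") or ""
        # API Gateway may deliver the body base64-encoded.
        if event.get("isBase64Encoded"):
            raw = base64.b64decode(raw).decode("utf-8")
        try:
            text = json.loads(raw).get("text", "").strip()
        except (json.JSONDecodeError, AttributeError):
            # Covers malformed JSON, non-object bodies, and non-string text values.
            text = ""
        if not text:
            return {"statusCode": 400, "body": json.dumps({"error": "missing text"})}

        audio = synthesize(text)
        url = store("speech.mp3", audio)
        # Zapier can read this JSON response and post the URL back to Slack.
        return {"statusCode": 200, "body": json.dumps({"audio_url": url})}

    return handler
```

Passing the two service calls in as arguments keeps the handler easy to try out locally with fake functions before wiring up real credentials.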