I need help creating a smart assistant on the Azure cloud platform. The idea is to let users talk or type their questions, and then have the AI figure out what they want and call the right API automatically.
What I’m trying to build:
Users can speak or chat with the assistant
AI understands requests like “Get my recent purchases from last 3 months”
System picks the right API to call (like inventory, customers, invoices, etc.)
Returns data in an easy-to-read format
My current environment:
Several REST APIs running on Azure:
inventory
customers
invoices
analytics
reports
Everything goes through Azure API Management
APIs have Swagger/OpenAPI specs
Main challenges I’m facing:
Should I go with Azure AI Studio for this?
How can I connect my existing APIs to the AI agent?
What’s the recommended approach for understanding user intentions and extracting key info like dates?
Are there any reference architectures or example projects I can follow?
I’m looking for practical advice on how to wire this all together. Code examples would be really helpful too!
I did this exact same thing 6 months ago - had to connect our financial APIs to a voice interface. Tried a bunch of different ways, but what worked was Azure OpenAI with function calling plus Azure Speech Services.

The game-changer was GPT-4’s function calling instead of building intent parsing from scratch. You describe your APIs as functions in the system prompt, and GPT-4 figures out which function to call based on what the user says. Since you’ve got OpenAPI specs already, you can convert those to function definitions automatically. So for “recent purchases,” the AI pulls out the timeframe and hits the right API calls. Conversations feel way more natural because GPT-4 handles context and follow-ups without extra training.

Hooking into your existing Azure API Management is easy - just add auth headers and route through your current gateway. Speech-to-text works great for business terms once you tune the custom vocabulary. Took me 3-4 weeks total including testing, which beat the hell out of the LUIS approach we looked at first. Barely any maintenance since you don’t retrain models every time you add APIs.
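If it helps, here’s roughly what the spec-to-function conversion can look like - a minimal sketch, assuming OpenAPI 3 specs served behind API Management; the spec URL and naming are placeholders you’d swap for your own:

```python
# Rough sketch: turn an OpenAPI spec into tool definitions for Azure OpenAI
# function calling. The spec URL is a placeholder for your own environment.
import json
import re
import requests

HTTP_METHODS = {"get", "post", "put", "delete", "patch"}

def openapi_to_tools(spec_url: str) -> list:
    """Convert each OpenAPI operation into a chat-completions tool definition."""
    spec = requests.get(spec_url).json()
    tools = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            if method not in HTTP_METHODS:  # skip path-level keys like "parameters"
                continue
            params = {"type": "object", "properties": {}, "required": []}
            for p in op.get("parameters", []):
                params["properties"][p["name"]] = {
                    "type": p.get("schema", {}).get("type", "string"),
                    "description": p.get("description", ""),
                }
                if p.get("required"):
                    params["required"].append(p["name"])
            # Function names only allow letters, digits, underscores and dashes
            raw_name = op.get("operationId", f"{method}_{path}")
            name = re.sub(r"[^0-9A-Za-z_-]", "_", raw_name)
            tools.append({
                "type": "function",
                "function": {
                    "name": name,
                    "description": op.get("summary", f"{method.upper()} {path}"),
                    "parameters": params,
                },
            })
    return tools

# Example: build tools from the invoices API behind API Management (hypothetical URL)
tools = openapi_to_tools("https://my-apim.azure-api.net/invoices/openapi.json")
print(json.dumps(tools[:1], indent=2))
```

The real version needs request-body handling and smarter naming, but this is enough to get GPT-4 picking the right operation.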
Azure AI Studio is your best bet. Built something nearly identical for our internal tools 8 months ago and hit the same decisions you’re facing.
The key is Azure OpenAI’s function calling inside AI Studio. Feed it your existing Swagger specs and it automatically knows what APIs you have. Someone says “get my purchases from last 3 months” and the model picks the right function, extracts date ranges - done.
What worked for us:
Import OpenAPI specs into AI Studio as custom functions
Write a system prompt explaining what each API does
Use chat completions API with function calling enabled (rough sketch after this list)
Connect through your existing API Management for auth/routing
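To make the function-calling step concrete, here’s a minimal sketch, assuming an Azure OpenAI deployment and a tools list built from your specs; the endpoint, keys, deployment name, and the call_backend_api helper are placeholders for your own setup:

```python
# Sketch: chat completions with function calling via the openai Python SDK
# against an Azure OpenAI deployment. All names/URLs below are placeholders.
import json
import os
import requests
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
)

SYSTEM_PROMPT = (
    "You are an assistant for our business APIs (inventory, customers, "
    "invoices, analytics, reports). Pick the right function and extract "
    "parameters such as date ranges from the user's request."
)

def call_backend_api(name: str, arguments: dict) -> dict:
    """Route the chosen function to the real API through API Management (hypothetical URL)."""
    resp = requests.get(
        f"https://my-apim.azure-api.net/{name}",
        params=arguments,
        headers={"Ocp-Apim-Subscription-Key": os.environ["APIM_KEY"]},
    )
    return resp.json()

def ask(question: str, tools: list) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    response = client.chat.completions.create(
        model="gpt-4o",  # your deployment name
        messages=messages,
        tools=tools,
    )
    choice = response.choices[0].message
    if choice.tool_calls:  # the model decided to call one of your APIs
        call = choice.tool_calls[0]
        result = call_backend_api(call.function.name, json.loads(call.function.arguments))
        messages.append(choice)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
        # Second pass: let the model turn raw JSON into a readable answer
        final = client.chat.completions.create(model="gpt-4o", messages=messages)
        return final.choices[0].message.content
    return choice.content
```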
Zero training needed. GPT-4 already gets business language. We handle inventory queries, customer lookups, reports - all through natural conversation.
For speech input, just add Azure Speech Services on the frontend. Maybe 20 lines of JavaScript.
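The speech piece really is tiny in any of the SDKs - here’s a sketch in Python for prototyping server-side (the browser JavaScript version is structurally the same), with key and region as placeholders:

```python
# Minimal speech-to-text sketch with the Azure Speech SDK (Python flavor).
# SPEECH_KEY and SPEECH_REGION are placeholders for your own resource.
import os
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription=os.environ["SPEECH_KEY"],
    region=os.environ["SPEECH_REGION"],
)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

print("Say something...")
result = recognizer.recognize_once()  # listens on the default microphone
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("You said:", result.text)  # feed this text into the chat completion call
```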
Biggest gotcha: rate limits when chaining multiple API calls. Batch requests when you can.
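When you can’t batch, a simple backoff wrapper around the chat call also takes the edge off 429s - a sketch, not production code:

```python
# One way to soften rate limits when chaining calls: retry with exponential
# backoff on a rate-limit error. Tune attempts/waits to your quota.
import time
from openai import RateLimitError

def with_backoff(fn, max_attempts=5):
    """Call fn(); on a rate-limit error, wait 1s, 2s, 4s, ... and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)

# Example: wrap the chat completion from the earlier sketch
# answer = with_backoff(lambda: ask("Get my recent purchases from last 3 months", tools))
```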
This video walks through the whole AI Studio setup. Way clearer than Microsoft’s docs.
Took me about 2 weeks including testing. Most of that was just dialing in the system prompts.
Azure Cognitive Services + Function Apps is probably overkill for this. I’d go with Azure OpenAI Service and function calling instead - much simpler setup. Just define your APIs as functions in the prompt and GPT handles the intent parsing automatically. Works great with existing Swagger specs too.
Been down this exact path building something similar for our customer support team. Azure AI Studio gets complex fast - way too many services to wire together.
After trying Azure, I realized you’re managing too many moving parts: speech services, language understanding, function calling, API management, plus all the glue code to connect everything.
Switched to Latenode for intelligent API orchestration and it’s been a game changer. You build the entire flow visually:
Connect speech-to-text directly
Use AI nodes to parse intent and extract parameters like dates
Add conditional logic to route to the right API
Transform responses into readable formats
Handle follow-up questions in the same flow
Best part? Import your existing APIs using OpenAPI specs and Latenode automatically creates connection nodes. No rewrites needed.
For your “recent purchases” example: Speech input → AI intent parsing → Date extraction → Customer API call → Invoice API call → Format response. All visual, no complex code.
Had our assistant running in 2 days instead of weeks with Azure. Way easier to modify when requirements change too.
Azure Bot Framework with LUIS is perfect for this. Built almost the same thing for our procurement system last year - worked like a charm.

Here’s how it works: Bot Framework manages conversations, LUIS pulls intents and entities from what users type, then Azure Functions call your APIs through API Management. Since you’ve got OpenAPI specs already, you can auto-generate the function bindings. For your “recent purchases” example - LUIS grabs the intent (getPurchases) and timeframe (3 months), triggers the right function to hit your customer and invoice APIs. Results come back through adaptive cards so they look clean.

Biggest win? Built-in conversation state. Users can ask follow-ups like “show me order #123 details” and the bot remembers context without extra work on your end. LUIS training’s maybe a day if you’ve got decent sample phrases. Whole thing usually takes about a week including testing. Way more solid than trying to parse intents through generic AI services.
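For what it’s worth, the LUIS half is just one REST call once the app is trained - a rough sketch against the v3 prediction endpoint, with the endpoint host, app ID, and key as placeholders for your own resource:

```python
# Sketch: call the LUIS v3 prediction endpoint and read the top intent plus
# entities (e.g. datetimeV2 for "last 3 months"). Env vars are placeholders.
import os
import requests

def get_prediction(utterance: str) -> dict:
    url = (
        f"https://{os.environ['LUIS_ENDPOINT']}/luis/prediction/v3.0/apps/"
        f"{os.environ['LUIS_APP_ID']}/slots/production/predict"
    )
    resp = requests.get(url, params={
        "subscription-key": os.environ["LUIS_KEY"],
        "query": utterance,
        "verbose": "true",
    })
    return resp.json()["prediction"]

prediction = get_prediction("Get my recent purchases from last 3 months")
intent = prediction["topIntent"]            # e.g. "getPurchases"
entities = prediction.get("entities", {})   # e.g. {"datetimeV2": [...]}
# From here the bot hands intent + entities to an Azure Function that calls
# your customer/invoice APIs through API Management.
```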