How can I configure n8n with OpenAI to extract data and format it as structured JSON?

I need help setting up an OpenAI node in n8n that can pull out certain details from user messages and give back a properly formatted JSON response.

Let’s say I want to grab things like a person’s email and phone number from their message. I’m looking for output that looks something like this:

{
  "reply_message": "Thanks for reaching out! What's the best way to contact you? 📞",
  "extracted_data": [
    {
      "data_type": "Email",
      "results": [{ "content": "[email protected]" }]
    },
    {
      "data_type": "Phone",
      "results": [{ "content": "555-0123" }]
    }
  ]
}

What kind of prompt should I write to get this working? Also, how do I set up the OpenAI node settings in n8n so the response always comes back in this exact JSON format?

I really need the output to be consistent and easy for other systems to process. Any suggestions would be great!

n8n gets messy when you’re trying to force consistent JSON output out of it. Hit this exact problem last year building customer support automation.

The issue isn’t just prompts or temperature - you need a solid pipeline handling the entire flow, not just OpenAI.

Switched to Latenode and it solved everything. Their OpenAI integration lets you define strict JSON schemas right in the node config. No more hoping the AI follows your format.

You set up your exact JSON structure, connect webhooks or forms for user messages, then pipe structured output straight to your database or whatever.
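
For reference, the structure you’d define is basically plain JSON Schema. For the output format in the question it would look roughly like this - the exact way you paste it into the node config may differ, this is just the generic schema:

{
  "type": "object",
  "properties": {
    "reply_message": { "type": "string" },
    "extracted_data": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "data_type": { "type": "string", "enum": ["Email", "Phone"] },
          "results": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": { "content": { "type": "string" } },
              "required": ["content"]
            }
          }
        },
        "required": ["data_type", "results"]
      }
    }
  },
  "required": ["reply_message", "extracted_data"]
}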

Built-in error handling and retry logic too. When OpenAI returns garbage, Latenode catches it and retries instead of breaking everything.

Moved three n8n automations to Latenode last year - haven’t looked back. JSON extraction just works without manual prompt engineering.

I’ve worked with similar setups and temperature settings are huge for consistent JSON output. Set it to 0.1 or lower in the OpenAI node - that kills most of the variation in responses.

For prompts, be super specific about field names and structure. Something like “Extract contact info and return JSON with exactly these fields: reply_message, extracted_data array with data_type and results objects”. Throw a few example outputs into your system prompt too - OpenAI works way better with concrete examples than vague descriptions.

One thing that bit me early on: n8n chokes on malformed JSON sometimes, so I added error handling in the downstream nodes to catch when the AI goes off-script with formatting.
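
To make that concrete, here’s roughly the kind of system prompt I mean - the wording and the example values are just placeholders, adapt them to your own data types:

You extract contact details from user messages.
Always reply with valid JSON only - no markdown, no extra text.
Use exactly this structure:

{
  "reply_message": "<short friendly reply to the user>",
  "extracted_data": [
    {
      "data_type": "Email or Phone",
      "results": [{ "content": "<value found in the message>" }]
    }
  ]
}

Only include a data_type in extracted_data if it actually appears in the message.

Example output:
{
  "reply_message": "Thanks! I'll follow up by email.",
  "extracted_data": [
    { "data_type": "Email", "results": [{ "content": "user@example.com" }] }
  ]
}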

Been running OpenAI extractions in n8n for months - here’s what saved me tons of headaches: skip the JSON-in-the-prompt approach and use function calling instead.

Set up your OpenAI node with a custom function that matches your schema exactly. Define parameters for email, phone, reply_message, whatever you need. OpenAI treats this like a function call, not free-form text generation, so it’s way more reliable than fiddling with prompts and temperature settings. You’ll still get the occasional parsing error, but far fewer.

Stick with gpt-3.5-turbo or gpt-4 - older models don’t support function calling at all. This approach forces the AI to think in your structure from the start instead of generating text and hoping it fits your format.
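
For the format in the question, the function definition would look something like this (sketch only - extract_contact_data is just a name I picked, and the parameters block is the same JSON Schema shape as the example output):

{
  "name": "extract_contact_data",
  "description": "Extract contact details from the user's message and draft a short reply",
  "parameters": {
    "type": "object",
    "properties": {
      "reply_message": { "type": "string" },
      "extracted_data": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "data_type": { "type": "string", "enum": ["Email", "Phone"] },
            "results": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": { "content": { "type": "string" } },
                "required": ["content"]
              }
            }
          },
          "required": ["data_type", "results"]
        }
      }
    },
    "required": ["reply_message", "extracted_data"]
  }
}

Then parse the function-call arguments in the next node instead of the plain message text.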

hey! don’t forget to set your OpenAI node’s response format to ‘json_object’. also keep something like “always reply in JSON” in your prompt - json_object mode actually requires the word “JSON” to appear somewhere in your messages, and it helps keep the output structured. good luck with n8n!
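
if you’re curious what that maps to under the hood, it’s the response_format parameter on the chat completions request - roughly like this (model name and messages are just examples):

{
  "model": "gpt-4o-mini",
  "response_format": { "type": "json_object" },
  "messages": [
    { "role": "system", "content": "Always reply in JSON using the agreed structure." },
    { "role": "user", "content": "you can reach me at user@example.com" }
  ]
}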