How to activate JSON response format with GPT-4 vision model via OpenAI API?

I’m trying to get JSON formatted responses when working with the GPT-4 vision model but running into issues. The documentation mentioned that you can use response_format with {"type": "json_object"} for certain models, but when I try this with the vision model, I get a validation error saying extra fields are not allowed.

Has anyone successfully implemented JSON mode with vision capabilities? Here’s what I’m attempting:

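import requests

# my_api_key, user_prompt, and encoded_image (a base64-encoded PNG) are defined elsewhere.
request_headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {my_api_key}"
}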

api_payload = {
    "model": "gpt-4-vision-preview",
    "response_format": {"type": "json_object"},  # this field triggers the validation error
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant that returns structured JSON responses."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{encoded_image}"
                    }
                }
            ]
        }
    ],
    "max_tokens": 800
}

api_response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=request_headers,
    json=api_payload,
)
print(api_response.json())

When I remove the response_format parameter, everything works normally. Any ideas on how to properly enable JSON formatting for vision model responses?

The vision model (gpt-4-vision-preview) doesn’t support response_format yet, which is why the API rejects the parameter. Here’s what works for me: be specific in your system message about the JSON structure you want. Instead of just saying ‘return JSON’, spell out the keys you need, for example ‘Always respond with valid JSON containing: analysis, confidence, results.’ Also add ‘Respond only in valid JSON format’ at the end of your user prompt. In my experience this reliably returns valid JSON without triggering the validation error from response_format.
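Here is a rough sketch of that approach, reusing the payload shape from the question. The helper names build_payload and extract_json are made up for illustration, and the fence-stripping step assumes the model may sometimes wrap its answer in a markdown code fence, so treat it as a starting point rather than a guaranteed fix.

```python
import json
import re

# System message that spells out the expected keys instead of just "return JSON".
SYSTEM_PROMPT = (
    "You are an AI assistant that returns structured JSON responses. "
    "Always respond with valid JSON containing: analysis, confidence, results."
)

# Built at runtime so the markdown fence marker never appears literally here.
FENCE = "`" * 3


def build_payload(user_prompt: str, encoded_image: str) -> dict:
    """Build a chat completions payload that omits response_format."""
    return {
        "model": "gpt-4-vision-preview",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        # Repeat the JSON instruction at the end of the user prompt.
                        "text": f"{user_prompt}\n\nRespond only in valid JSON format.",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{encoded_image}"},
                    },
                ],
            },
        ],
        "max_tokens": 800,
    }


def extract_json(reply_text: str) -> dict:
    """Parse the model's reply, tolerating a surrounding markdown code fence."""
    text = reply_text.strip()
    # Assumption: the reply may be wrapped in a fence, optionally tagged "json".
    fenced = re.match(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)  # raises json.JSONDecodeError if the reply is not JSON
```

Post the result of build_payload(user_prompt, encoded_image) with requests.post exactly as in the question, then pass api_response.json()["choices"][0]["message"]["content"] to extract_json.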

Exactly! Since gpt-4-vision-preview doesn’t support response_format, you have to prompt it directly for JSON. Just say “respond in JSON” and it usually works. Hopefully they’ll add proper support soon!
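Since prompting only "usually" works, it may be worth validating the reply and retrying when it doesn't parse. A minimal sketch, assuming you wrap your own API call in a function that returns the reply text (get_json_with_retry and call_model are hypothetical names, not part of any OpenAI library):

```python
import json
from typing import Callable


def get_json_with_retry(call_model: Callable[[], str], max_attempts: int = 3) -> dict:
    """Call the model until its reply parses as JSON, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        reply_text = call_model()
        try:
            return json.loads(reply_text)
        except json.JSONDecodeError as error:
            last_error = error  # reply was not valid JSON; ask again
    raise ValueError(f"No valid JSON after {max_attempts} attempts") from last_error
```

Each retry is a fresh API request, so keep max_attempts small to limit cost.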