I'm trying to get JSON-formatted responses from the GPT-4 vision model but am running into issues. The documentation says you can set response_format to {"type": "json_object"} for certain models, but when I try this with the vision model, I get a validation error saying extra fields are not allowed.
Has anyone successfully implemented JSON mode with vision capabilities? Here’s what I’m attempting:
import requests

request_headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {my_api_key}"
}

api_payload = {
    "model": "gpt-4-vision-preview",
    "response_format": {"type": "json_object"},
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant that returns structured JSON responses."
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{encoded_image}"}
                }
            ]
        }
    ],
    "max_tokens": 800
}

api_response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=request_headers,
    json=api_payload,
)
print(api_response.json())
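For completeness, this is roughly how I'm building the base64 data URL that goes into the image_url field (the helper name and the image/png MIME type are mine, adjust for your image format):

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    # Base64-encode raw image bytes into a data: URL for the API payload
    encoded_image = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded_image}"
```

So encoded_image in the payload above is just the base64 part of that URL.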
When I remove the response_format parameter, everything works normally. Any ideas on how to properly enable JSON formatting for vision model responses?
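In the meantime, my workaround is to ask for JSON in the prompt and parse the text reply defensively, since the model sometimes wraps its answer in a Markdown code fence. This is just a sketch (the helper name is mine), not an official API feature:

```python
import json
import re

def extract_json(reply_text: str):
    """Parse JSON from a model reply, tolerating ```json ... ``` fences."""
    # If the answer is wrapped in a Markdown code fence, keep only its body
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", reply_text, re.DOTALL)
    candidate = match.group(1) if match else reply_text.strip()
    return json.loads(candidate)
```

That works, but I'd still prefer the model-enforced JSON mode if it can be enabled for vision requests.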