Using structured output with Ollama functions in LangChain Python

I’m trying to build a self-RAG system using LangChain but can’t use OpenAI models. I want to use OllamaFunctions with structured output instead of regular ChatOllama. Here’s my test code:

from langchain_experimental.llms.ollama_functions import OllamaFunctions
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field

# Define response schema
class Employee(BaseModel):
    full_name: str = Field(description="Employee's full name", required=True)
    age: int = Field(description="Employee's age", required=True)
    department: str = Field(description="Employee's department")

# Create prompt
template = PromptTemplate.from_template(
    """Sarah works in marketing and is 28 years old.
    Tom is 5 years older than Sarah and works in engineering.
    
    Question: {query}
    Answer: """
)

# Setup model
model = OllamaFunctions(model="phi3", temperature=0)
structured_model = model.with_structured_output(Employee)
pipeline = template | structured_model

First I get a validation error about a duplicate ‘format’ argument. After removing the format parameter, I get a NotImplementedError when calling with_structured_output(). How can I make this work with OllamaFunctions?

Yeah, with_structured_output() doesn’t work with OllamaFunctions yet - I hit the same wall last month switching from OpenAI to local models. I switched to ChatOllama and handled structured output manually with JSON parsing: define your schema in the prompt, tell the model to respond in JSON only, then parse the reply into your Pydantic model. Alternatively, try llama-cpp-python with function calling if you really need native structured output. OllamaFunctions is still experimental and missing a lot of features. The prompt-engineering approach for JSON responses has worked great for me with phi3.
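To make the manual approach concrete, here’s a framework-free sketch of the prompt-plus-parse pattern. The prompt wording and the parse_employee helper are my own illustrations, not from any library; in a real chain you’d send PROMPT through ChatOllama and feed the response’s .content into the parser, which I stub with a canned reply here so it runs without a server:

```python
import json
import re

# Schema described directly in the prompt - the model is told to emit JSON only.
PROMPT = """Sarah works in marketing and is 28 years old.
Tom is 5 years older than Sarah and works in engineering.

Answer the question as a single JSON object with exactly these keys:
"full_name" (string), "age" (integer), "department" (string).
Respond with JSON only, no extra text.

Question: {query}
Answer: """

REQUIRED_KEYS = {"full_name": str, "age": int, "department": str}

def parse_employee(raw: str) -> dict:
    """Strip markdown fences the model may add, load the JSON, check the keys."""
    cleaned = re.sub(r"^```(?:json)?|```$", "", raw.strip(), flags=re.MULTILINE).strip()
    data = json.loads(cleaned)
    for key, typ in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"bad or missing field: {key}")
    return data

# In the real pipeline this string would come from ChatOllama; a canned reply
# stands in here, including the markdown fence small models often add.
sample_reply = '```json\n{"full_name": "Tom", "age": 33, "department": "engineering"}\n```'
print(parse_employee(sample_reply))
```

The key-by-key isinstance check is a poor man’s Pydantic validation; swapping in `Employee.parse_obj(data)` from the question’s schema gives you the same guarantee with better error messages.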

You’re hitting this because OllamaFunctions is half-baked for complex structured output. Had the same headaches building our document processing system.

Ditch the manual JSON parsing entirely. You need proper workflow automation that handles structured output conversion without the mess.

I built something similar for our RAG system - needed consistent data extraction from various sources. Instead of fighting LangChain’s experimental stuff, I set up the whole pipeline in Latenode. It connects directly to Ollama models and handles all structured output conversion automatically.

Define your schema once, and Latenode handles prompt formatting, response parsing, and error handling. No more validation errors or NotImplementedError surprises. Built-in retry logic when your local model gives wonky responses.

For Employee extraction, just drag and drop - Ollama connector, schema validator, output formatter. Takes 10 minutes to set up what you’re coding manually.

Check it out: https://latenode.com

Had this exact issue building my RAG pipeline last week. The OllamaFunctions wrapper is pretty limited compared to OpenAI’s function calling. What worked for me: use regular ChatOllama with Pydantic’s output parser instead. Create a PydanticOutputParser from your Employee model and inject its format instructions straight into your prompt template. The parser handles JSON validation and converts the response into your Pydantic object automatically. Just add retry logic, since local models sometimes spit out malformed JSON. I’m using llama3 for this and getting solid results - phi3 should work fine too. Way more reliable than forcing OllamaFunctions to do something it wasn’t built for.
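Under the hood, PydanticOutputParser does two things: it renders the model’s schema into format instructions for the prompt, and it validates the reply back into the typed object. Here’s a rough framework-free imitation so the mechanics are visible without a running model (the instruction wording is my own, not LangChain’s exact text; the field names mirror the question’s Employee schema):

```python
import json

# Hand-rolled stand-in for the Employee Pydantic model's JSON schema.
EMPLOYEE_SCHEMA = {
    "full_name": {"type": "string", "description": "Employee's full name"},
    "age": {"type": "integer", "description": "Employee's age"},
    "department": {"type": "string", "description": "Employee's department"},
}

def get_format_instructions(schema: dict) -> str:
    """Build the 'respond as JSON matching this schema' text injected into the prompt."""
    return ("The output should be a JSON object conforming to this schema:\n"
            + json.dumps(schema, indent=2))

def parse(schema: dict, raw: str) -> dict:
    """Validate the model's reply against the schema, like parser.parse() would."""
    type_map = {"string": str, "integer": int}
    data = json.loads(raw)
    for name, spec in schema.items():
        if not isinstance(data.get(name), type_map[spec["type"]]):
            raise ValueError(f"field {name!r} missing or not a {spec['type']}")
    return data

prompt = (
    "Sarah works in marketing and is 28 years old. "
    "Tom is 5 years older than Sarah and works in engineering.\n\n"
    + get_format_instructions(EMPLOYEE_SCHEMA)
    + "\n\nQuestion: How old is Tom?\nAnswer: "
)
# The real chain would be: template | ChatOllama(model="phi3") | parser
print(parse(EMPLOYEE_SCHEMA, '{"full_name": "Tom", "age": 33, "department": "engineering"}'))
```

With LangChain installed, `PydanticOutputParser(pydantic_object=Employee)` replaces both helpers and `parser.get_format_instructions()` slots into the template as a partial variable.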

OllamaFunctions is just a wrapper trying to copy OpenAI’s function calling, but it’s missing half the methods. I ran into this while building a knowledge extraction system.

Skip OllamaFunctions completely. Use ChatOllama with JsonOutputParser instead - way cleaner. Just tell phi3 exactly what JSON structure you want in the prompt.

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

# Employee is the Pydantic model defined in the question
parser = JsonOutputParser(pydantic_object=Employee)
model = ChatOllama(model="phi3", temperature=0)

template = PromptTemplate(
    template="Sarah works in marketing and is 28 years old...\n\n{format_instructions}\n\nQuestion: {query}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

chain = template | model | parser

You get proper error handling and retry logic. I’m running this in production with llama3 and it handles malformed responses way better than manual parsing.

A video on structured output patterns with Ollama helped me figure this out - it shows how to set up these patterns without wrestling with experimental wrappers. Much better than forcing OllamaFunctions to do something it can’t.