Ollama’s JSON mode doesn’t enforce the structure you have in mind; it only asks for JSON, and without a prompt that spells out the expected shape the model can still return the wrong keys or output you cannot use.

Here’s how you can get Ollama to reliably spit out JSON, and what’s really going on under the hood.

The Problem

You’re trying to get Ollama to return data in a structured format, specifically JSON, so you can parse it programmatically. You’ve likely tried adding a prompt like "Respond in JSON format" or setting the format parameter to json, but you’re still getting plain text or, worse, JSON wrapped in preamble and postamble text.

The "Why"

Ollama itself is a server that exposes an API to run models. Its responses always arrive in a JSON envelope; the format parameter, sent as "format": "json" in the request body rather than as a ?format=json query string, applies to the text the model generates inside that envelope. However, the model itself is still a language model; it doesn’t inherently understand your formatting rules without explicit guidance and, sometimes, specific model training. If the prompt never describes the JSON you want, the model can produce the wrong keys, the wrong types, or long runs of whitespace, and the server can’t magically fix that.
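
To make the two layers concrete, here is a minimal sketch that parses the server's envelope first and the model's text second. The helper name split_layers and both sample bodies are made up for illustration; only the response field name comes from the API output shown later in this article.

```python
import json


def split_layers(http_body: str):
    """Return (envelope, model_data); model_data is None if the model's text is not JSON."""
    # Layer 1: the envelope produced by the server.
    envelope = json.loads(http_body)
    # Layer 2: the text the model generated, carried as a plain string.
    model_text = envelope.get("response", "")
    try:
        return envelope, json.loads(model_text)
    except json.JSONDecodeError:
        return envelope, None


# Hypothetical bodies: one where the model complied, one where it did not.
good_body = json.dumps({"model": "llama3", "response": '{"name": "Alice"}', "done": True})
bad_body = json.dumps({"model": "llama3", "response": "Sure! Alice is 30.", "done": True})

print(split_layers(good_body)[1])  # the parsed inner object
print(split_layers(bad_body)[1])   # None: the envelope parsed, the model text did not
```

The envelope parses in both cases; only the inner string depends on what the model wrote, which is why the prompt matters.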

How to Get Reliable JSON

The most robust way to get JSON from Ollama is to combine a strong prompt with the format parameter set to json.

  1. The Prompt is Key: The model needs to be told exactly what to output and how to format it. Don’t just say "give me JSON." Specify the structure of the JSON you expect.

    Example Prompt:

    You are a helpful assistant that extracts information from text and returns it as a JSON object.
    Given the following text, extract the person's name, age, and occupation.
    Return the result as a JSON object with keys "name", "age", and "occupation".
    
    Text: "Alice is 30 years old and works as a software engineer."
    
  2. Set "format": "json" in the Request Body: The format option belongs in the JSON body of the request, not in the URL as a ?format=json query string. It asks Ollama to make the model produce JSON. The API then returns that output as a string inside its own, larger JSON object.

    Example API Call (using curl):

    # "format" goes in the JSON body. The apostrophe in "person's" is written as '\''
    # because the whole body is wrapped in single quotes for the shell.
    curl http://localhost:11434/api/generate \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama3",
        "prompt": "You are a helpful assistant that extracts information from text and returns it as a JSON object.\nGiven the following text, extract the person'\''s name, age, and occupation.\nReturn the result as a JSON object with keys \"name\", \"age\", and \"occupation\".\n\nText: \"Alice is 30 years old and works as a software engineer.\"",
        "format": "json",
        "stream": false
      }'
    
  3. Parse the response Field: With "stream": false, the API returns a single JSON object. The model's output is not nested as an object; it is a string in the field named response. Extract that string and parse it as JSON in a second step.

    Example curl Output:

    {
      "model": "llama3",
      "created_at": "2024-05-15T10:30:00.123456Z",
      "response": "{\n  \"name\": \"Alice\",\n  \"age\": 30,\n  \"occupation\": \"software engineer\"\n}",
      "done": true,
      "context": [
        // ... context data ...
      ],
      "total_duration": 1234567890,
      "load_duration": 123456789,
      "prompt_eval_count": 50,
      "prompt_eval_duration": 12345678,
      "eval_count": 100,
      "eval_duration": 123456789
    }
    

    Notice that the actual JSON string is inside the "response" field.

  4. Post-Processing (if needed): If you’re still seeing occasional junk, especially with older or less capable models, you might need a small post-processing step in your client code to clean up the response string before parsing it as JSON. This could involve stripping leading/trailing whitespace or attempting to find the start and end of the JSON object (e.g., finding the first { and the last }).

    Example Python Snippet:

    import json
    import re
    
    raw_api_response = {
      "model": "llama3",
      "created_at": "2024-05-15T10:30:00.123456Z",
      "response": "Some preamble text.\n{\n  \"name\": \"Alice\",\n  \"age\": 30,\n  \"occupation\": \"software engineer\"\n}\nSome trailing text.",
      "done": True,
      # ... other fields
    }
    
    model_output_string = raw_api_response.get("response", "")
    
    # Attempt to find and extract JSON string if model added preamble/postamble
    json_match = re.search(r'\{.*\}', model_output_string, re.DOTALL)
    if json_match:
        json_string = json_match.group(0)
        try:
            structured_data = json.loads(json_string)
            print("Successfully parsed JSON:", structured_data)
        except json.JSONDecodeError:
            print("Could not parse extracted JSON string.")
    else:
        print("No JSON object found in the response.")
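
The four steps above can be sketched end to end in Python. This is a hedged illustration rather than a reference client: build_request, extract_first_object, and ask are hypothetical helper names, the endpoint and model are the ones from the curl example, and format is placed in the JSON body, which is where Ollama's API documentation puts it. The extractor uses json.JSONDecoder.raw_decode instead of a regular expression, so stray braces in surrounding text do not break parsing.

```python
import json
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # endpoint from the curl example


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build the POST request; "format" travels in the JSON body, not the URL."""
    payload = {"model": model, "prompt": prompt, "format": "json", "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_first_object(text: str) -> Optional[dict]:
    """Return the first JSON object that parses cleanly inside text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


def ask(prompt: str) -> Optional[dict]:
    """Send the prompt to a running Ollama server and parse the "response" field."""
    with urllib.request.urlopen(build_request(prompt)) as reply:
        envelope = json.load(reply)
    return extract_first_object(envelope.get("response", ""))
```

Calling ask(...) needs a local Ollama server with the model pulled; build_request and extract_first_object can be exercised without one.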
    
    

The Surprise: System Prompts and JSON Schemas

The format parameter is not the only lever. Some models, like llama3:instruct or mistral:instruct, are fine-tuned for instruction following. For these, a system prompt that describes the expected structure goes a long way.

More importantly, newer Ollama releases offer a more direct enforcement mechanism: instead of the string "json", the format parameter can carry a JSON schema describing the object you expect. Check Ollama’s API documentation to confirm that your installed version supports it.

With a schema in place, generation itself is constrained to output that matches the schema, which is far more robust than relying on the prompt alone.
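
As a concrete point of comparison, Ollama's structured outputs documentation describes passing a JSON schema as the value of format. The sketch below only builds such a request body; build_schema_payload and person_schema are illustrative names, and whether your installed Ollama version and model honour the schema is an assumption to verify against the documentation.

```python
import json

# JSON Schema for the object requested in the example prompt.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "occupation": {"type": "string"},
    },
    "required": ["name", "age", "occupation"],
}


def build_schema_payload(prompt: str, model: str = "llama3") -> dict:
    """Request body with a schema in "format" instead of the string "json"."""
    return {"model": model, "prompt": prompt, "format": person_schema, "stream": False}


# Serialized body, ready to POST to /api/generate.
body = json.dumps(build_schema_payload(
    'Text: "Alice is 30 years old and works as a software engineer."'
))
```

Describing the same keys in the prompt is still worthwhile, since the schema limits what the model may emit but does not tell it what the fields mean.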

The Next Step

Once you have reliable JSON, the next challenge is handling errors and edge cases in the generated data, such as missing fields or unexpected data types within the JSON itself.
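
A minimal sketch of that kind of check, assuming the three fields from the example prompt. EXPECTED_FIELDS and find_problems are hypothetical names; a schema validation library would do the same job with more rigor.

```python
# Expected keys and their Python types, taken from the example prompt.
EXPECTED_FIELDS = {"name": str, "age": int, "occupation": str}


def find_problems(data: dict) -> list:
    """Return human-readable problems; an empty list means the data looks usable."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    return problems


print(find_problems({"name": "Alice", "age": 30, "occupation": "software engineer"}))
print(find_problems({"name": "Alice", "age": "30"}))
```

The first call prints an empty list; the second reports the string-typed age and the missing occupation, which is a reasonable trigger for a retry or a fallback.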
