Ollama + Open WebUI: Chat Interface for Local LLMs (2026)

Open WebUI can run locally and serve as a slick chat interface for your Ollama-hosted LLMs, letting you interact with models like Llama 3 or Mistral without needing to go to the command line.

Let’s see it in action. First, you need Ollama running. If you don’t have it, grab it from ollama.com. Once installed, pull a model:

ollama pull llama3

Now, to run Open WebUI, you’ll typically use Docker. If you have Docker installed, this command spins up the WebUI, maps its port, and connects it to your Ollama instance (usually running on http://host.docker.internal:11434 from within the container):

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

This command does a few key things:

-d: Runs the container in detached mode (in the background).
-p 3000:8080: Maps port 3000 on your host machine to port 8080 inside the container where Open WebUI listens.
--add-host=host.docker.internal:host-gateway: This is crucial for Docker Desktop on Mac and Windows. It creates an alias host.docker.internal that resolves to the host machine’s IP address, allowing the WebUI container to reach your Ollama instance running directly on your host. For Linux, you might need to use your actual host IP.
-v open-webui:/app/backend/data: This creates a Docker volume named open-webui to persist your WebUI data (like user settings and chat history) across container restarts.
--name open-webui: Assigns a name to your container for easier management.
--restart always: Ensures the container restarts automatically if it crashes or your system reboots.
ghcr.io/open-webui/open-webui:main: Specifies the Docker image to use.

After running this, navigate to http://localhost:3000 in your web browser. You’ll see the Open WebUI login screen. The first time, you’ll need to create an admin account. Once logged in, you should see your llama3 model (or any other models you’ve pulled with Ollama) available in the model selection dropdown.

The core problem Open WebUI solves is abstracting away the command-line interaction with LLMs. Instead of typing ollama run llama3, you get a persistent, rich chat interface. You can switch between models, manage them, and even have conversations with multiple AI assistants simultaneously if you’re running multiple models. It handles the API calls to Ollama for you, sending your prompts and receiving the model’s responses, then rendering them in a user-friendly format.

The "chat" experience is more than just sending text. Open WebUI leverages the chat completion API of Ollama. When you send a message, it’s formatted as a list of messages, including roles (user, assistant, system). This allows for more complex interactions, like continuing a conversation thread or providing system-level instructions to the model. For instance, you can set a system prompt for llama3 like "You are a helpful AI assistant that speaks only in haikus." This message is sent along with your user prompt to guide the model’s output.

Open WebUI also supports features like document RAG (Retrieval Augmented Generation) by integrating with document loaders. You can upload PDFs or other documents, and the WebUI will help process them so you can ask questions about their content using your LLM. This is powered by a separate backend service within the WebUI that embeds the document content and performs similarity searches.

One of the most surprising things about Open WebUI is how deeply it integrates with the underlying LLM’s capabilities by simply acting as a sophisticated client for the Ollama API. You can use it to manage model parameters like temperature, top-p, and repetition penalties directly from the UI, affecting how "creative" or "focused" the model’s output will be, without touching Ollama’s configuration files. For example, increasing the temperature from its default of 0.7 to 1.0 will make the model more prone to generating diverse and unexpected responses, while decreasing it to 0.3 will make it more deterministic and focused on the most probable answers. This is all managed via the API calls Open WebUI makes to Ollama.

The next step you’ll likely explore is integrating external tools or custom models, or perhaps setting up user management for multiple people.