The most surprising thing about deploying LLMs in air-gapped environments is how little the core LLM technology changes; it’s the delivery mechanism that becomes the intricate puzzle.
Imagine you’ve got a secure, offline network – no internet access, period. You want to run models like Llama 3, Mistral, or Phi-3 on it. Ollama is your go-to tool for running these locally, but how do you get the models and Ollama itself onto that air-gapped system?
Here’s how you’d typically set up Ollama for an air-gapped deployment:
1. Prepare Your "Internet-Connected" Machine
This is where you’ll download everything needed.
- Install Ollama: On a machine with internet access, download the Ollama installer from ollama.com/download and run it.
- Download Models: This is the crucial step. You need to pull the model files. Let’s say you want Llama 3 8B.

      ollama pull llama3

  This command downloads the model weights and configuration. Ollama stores these files in a models directory. For a per-user install on Linux and macOS, this is typically ~/.ollama/models. On Windows, it’s %USERPROFILE%\.ollama\models. When Ollama runs as a Linux system service, the directory usually lives under the service account’s home instead, for example /usr/share/ollama/.ollama/models.

  You can list your downloaded models and their sizes:

      ollama list

  This will show you something like:

      NAME          ID        SIZE      MODIFIED
      llama3:8b     <hash>    4.7 GB    2 days ago
      mistral:7b    <hash>    4.1 GB    3 days ago
      phi3:mini     <hash>    2.7 GB    1 day ago

  The models directory is not organized as one folder of weight files per model. It holds a manifests subdirectory, with one small JSON manifest per model tag, and a blobs subdirectory containing the weights, templates, and parameters as content-addressed files named after their SHA-256 digests. Models that share a layer share the same blob.
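Once the models are pulled, the whole models directory can be staged for transfer as a single archive. The following is a minimal sketch, not an Ollama feature: `bundle_models`, the archive name `ollama-models.tar`, and the reliance on GNU `tar` and `sha256sum` are all assumptions made for illustration (on macOS, `shasum -a 256` would be the closest substitute).

```shell
#!/usr/bin/env bash
# Sketch: bundle an Ollama models directory for offline transfer.
# bundle_models is a hypothetical helper, not part of Ollama.
bundle_models() {
  local models_dir="$1"   # e.g. "$HOME/.ollama/models"
  local out_dir="$2"      # e.g. a mounted removable drive
  local archive="$out_dir/ollama-models.tar"

  [ -d "$models_dir" ] || { echo "no such directory: $models_dir" >&2; return 1; }
  mkdir -p "$out_dir" || return 1

  # -C stores paths relative to the models directory, so the archive
  # can be unpacked at any location on the target machine.
  tar -cf "$archive" -C "$models_dir" . || return 1

  # Write a checksum next to the archive so the copy can be verified
  # after it has crossed the air gap.
  ( cd "$out_dir" && sha256sum ollama-models.tar > ollama-models.tar.sha256 )
}
```

A call such as `bundle_models "$HOME/.ollama/models" /mnt/usb` would produce the archive and its checksum file; running `sha256sum -c ollama-models.tar.sha256` in the destination directory checks the copy. The archive is left uncompressed on the assumption that model weights compress poorly.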
2. Transfer to the Air-Gapped Network
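- Copy Ollama Application: The Ollama executable itself needs to be transferred.
  - Linux: Find the ollama binary (often in /usr/local/bin/ollama or /usr/bin/ollama) and copy it. Depending on the version, the release may also include supporting libraries in a lib/ollama directory under the same prefix; if yours does, copy that directory as well, or carry over the original release archive instead.
  - macOS: The application bundle is usually in /Applications/Ollama.app. You can copy this entire directory.
  - Windows: Locate the ollama.exe file and copy it together with any files installed alongside it.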
- Copy Model Files: You need to copy the entire ~/.ollama/models directory (or its equivalent on your OS) from your internet-connected machine to the air-gapped machine. Ensure the directory structure is preserved, including both the manifests and blobs subdirectories, because each manifest refers to its blobs by digest. For example, a manifest stored at ~/.ollama/models/manifests/registry.ollama.ai/library/llama3/8b must end up at the same relative path under the models directory on the air-gapped machine.
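After the copy, it can be worth confirming that the large files arrived intact. The sketch below assumes blobs are stored as `blobs/sha256-<hex digest>` inside the models directory, so each file name doubles as its expected checksum; `verify_blobs` is a hypothetical helper name, and GNU `sha256sum` is assumed to be available on the air-gapped machine.

```shell
#!/usr/bin/env bash
# Sketch: check that each blob's content hash matches its file name.
# Assumes the layout <models>/blobs/sha256-<hex digest>.
# verify_blobs is a hypothetical helper, not an Ollama command.
verify_blobs() {
  local models_dir="$1" bad=0 blob expected actual
  for blob in "$models_dir"/blobs/sha256-*; do
    [ -f "$blob" ] || continue             # glob matched nothing
    expected="${blob##*/sha256-}"          # digest taken from the file name
    actual="$(sha256sum "$blob" | cut -d' ' -f1)"
    if [ "$expected" != "$actual" ]; then
      echo "MISMATCH: $blob" >&2
      bad=1
    fi
  done
  # Note: an empty blobs directory also returns 0 (nothing to verify).
  return "$bad"
}
```

Running `verify_blobs /path/to/your/copied/models` returns a non-zero status and names each damaged file if any blob was truncated or corrupted in transit. Hashing multi-gigabyte files takes a while, so this is a one-off check after transfer rather than something to run on every start.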
3. Configure and Run on the Air-Gapped Machine
- Place Ollama Executable: Put the copied ollama binary (or the Ollama.app directory, or ollama.exe) into a known location on the air-gapped machine. For Linux, placing it in /usr/local/bin is standard.
- Place Model Files: Ensure the copied models directory is in the default location Ollama expects for your OS (see step 1), or configure Ollama to look elsewhere.
- Set Environment Variable (Optional but Recommended): To ensure Ollama looks for models in your specific copied directory, you can set the OLLAMA_MODELS environment variable. On Linux/macOS:

      export OLLAMA_MODELS=/path/to/your/copied/models

  On Windows (Command Prompt):

      set OLLAMA_MODELS=C:\path\to\your\copied\models

  On Windows (PowerShell):

      $env:OLLAMA_MODELS="C:\path\to\your\copied\models"

  You would typically add this to your shell’s profile file (e.g., .bashrc or .zshrc on Linux/macOS, or a startup script on Windows) so it’s set automatically. The variable must be present in the environment of the process that runs ollama serve, since the server is what reads the model files.
- Start Ollama: Run the Ollama server. On Linux/macOS, from the directory containing the binary:

      ./ollama serve

  Or, if installed in a standard path:

      ollama serve

  On Windows:

      ollama.exe serve

  Ollama will start and serve the models it finds in the OLLAMA_MODELS directory.
- Run a Model: With the server still running, open a second terminal to interact with the models.

      ollama run llama3

  This command loads the llama3 model from your local files and presents you with an interactive prompt.
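A common failure at this stage is starting the server while it points at the wrong directory, in which case no models appear. As a rough preflight check, the sketch below resolves the directory the same way the steps above describe (OLLAMA_MODELS if set, otherwise the per-user default) and confirms it looks like a model store. The `check_models_dir` name is hypothetical, and the expectation of `blobs` and `manifests` subdirectories is an assumption about Ollama's on-disk layout.

```shell
#!/usr/bin/env bash
# Sketch: confirm the models directory looks usable before starting the server.
# check_models_dir is a hypothetical helper, not an Ollama command.
check_models_dir() {
  # Use OLLAMA_MODELS when set, otherwise fall back to the per-user default.
  local dir="${OLLAMA_MODELS:-$HOME/.ollama/models}"
  local sub
  for sub in blobs manifests; do
    if [ ! -d "$dir/$sub" ]; then
      echo "missing $dir/$sub" >&2
      return 1
    fi
  done
  echo "$dir"   # print the directory that will be used
}
```

One way to use it is `check_models_dir && ollama serve`, which starts the server only when the directory passes the check.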
What makes this work?
Ollama is designed to be a self-contained application. When you run ollama pull, it downloads the model’s configuration and the actual trained weights. When you run ollama serve, it reads these local files, loads them into memory (or onto the GPU if available), and exposes an API endpoint (defaulting to http://127.0.0.1:11434) for interaction. By transferring the Ollama binary and the entire model data directory, you’re essentially transplanting the entire execution environment. The OLLAMA_MODELS environment variable is key because it tells the Ollama server where to find those model files, bypassing any default locations that might not exist or be accessible in the air-gapped setup.
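Because the server is just a local HTTP endpoint, other programs on the air-gapped machine can use it without the interactive prompt. Here is a minimal sketch using curl against the `/api/generate` endpoint; `build_payload` and `ask_model` are hypothetical helper names, and the JSON is assembled naively, so it is only safe for model names and prompts without double quotes, backslashes, or newlines.

```shell
#!/usr/bin/env bash
# Sketch: send one prompt to a local Ollama server over HTTP.
# build_payload and ask_model are hypothetical helpers.
build_payload() {
  # $1 = model name, $2 = prompt. No JSON escaping is performed.
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

ask_model() {
  # $3 optionally overrides the default local endpoint.
  local base_url="${3:-http://127.0.0.1:11434}"
  # --fail makes curl exit non-zero on HTTP errors, such as an unknown model.
  curl --silent --fail --max-time 300 \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1" "$2")" \
    "$base_url/api/generate"
}
```

With the server running, `ask_model llama3 'Why is the sky blue?'` returns a single JSON object whose `response` field holds the generated text. Setting `"stream": false` asks for one complete reply instead of a stream of partial ones, which is simpler to handle in scripts.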
The critical insight is that the model files are just data. As long as the Ollama server can access that data and has the necessary libraries (which are bundled with the Ollama executable for most common platforms), it doesn’t need external network access to run the models.
The next hurdle you’ll likely encounter is managing model updates or introducing new models into the air-gapped environment, which requires repeating this transfer process.