
Connect to Ollama over Your Network

Use Maid as a mobile chat interface for an Ollama instance running on your desktop or home server. This lets you run larger, more capable models on real hardware while keeping your phone in your pocket.

What is Ollama?

Ollama is an open-source tool that makes it easy to download and run large language models on macOS, Linux, and Windows. It manages model downloads, handles GGUF and other formats, and exposes a simple REST API on http://localhost:11434 that Maid can talk to.

The key advantage over Maid's built-in llama.cpp provider is hardware: a desktop CPU or GPU can run much larger models (7B, 13B, 70B+ parameters) that would never fit in an Android phone's memory. Your phone acts purely as the chat front-end — input and output only, with all the heavy inference happening on the server.
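To make that concrete, here is a sketch of the kind of request a client like Maid sends to that API. The model name and prompt are examples, not required values:

```shell
# Example request body for Ollama's /api/generate endpoint.
# The model name is an example; use any model you have pulled.
BODY='{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
# With Ollama running on the same machine, this returns a JSON completion:
#   curl http://localhost:11434/api/generate -d "$BODY"
echo "$BODY"
```

With `"stream": true` (the default), the response arrives as a stream of JSON lines instead of a single object, which is what enables the real-time streaming described below.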

Your phone and the Ollama host must be on the same local network (e.g. the same Wi-Fi) for this to work.

Step 1 — Install Ollama on your computer

Download and install Ollama on the machine that will act as the server. Official installers are available for macOS, Linux, and Windows at ollama.com.

Once installed, pull a model to get started. For example, to download Llama 3.2 (3B):

ollama pull llama3.2

You can pull any model from the Ollama library. Larger models like Llama 3.1 70B or Mistral Large will give much better results on a capable desktop than anything that can run locally on a phone.

Step 2 — Allow remote connections

By default Ollama only accepts connections from the same machine (localhost). To let Maid on your phone connect to it, you need to tell Ollama to listen on all network interfaces.

On Linux / macOS (terminal):

OLLAMA_HOST=0.0.0.0 ollama serve

On Windows (PowerShell):

$env:OLLAMA_HOST="0.0.0.0"; ollama serve

You also need to make sure your firewall allows inbound TCP connections on port 11434 from your local network. On most home networks this is the only step required. On corporate or managed networks, additional firewall rules may be needed.

If you want Ollama to always start with remote access enabled, set OLLAMA_HOST=0.0.0.0 as a permanent environment variable in your system settings or shell profile.
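On a Linux server, one way to do this is a systemd drop-in override. This is a sketch; it assumes Ollama was set up as the service `ollama.service`, which the official Linux installer creates by default:

```shell
# Sketch: make OLLAMA_HOST permanent on a Linux server where Ollama
# runs as the systemd service "ollama.service" (assumed name).
# 1. Open a drop-in override in an editor:
#      sudo systemctl edit ollama.service
# 2. Insert this unit fragment:
OVERRIDE='[Service]
Environment="OLLAMA_HOST=0.0.0.0"'
echo "$OVERRIDE"
# 3. Apply the change:
#      sudo systemctl daemon-reload && sudo systemctl restart ollama
```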

Step 3 — Connect from Maid

With Ollama running and accepting remote connections, open Maid and configure it to connect.

1. In Maid, go to Settings and select Ollama from the API dropdown.
2. Option A — Automatic: Tap Find Ollama. Maid probes every host on your local subnet on port 11434 in parallel and sets the first responding host as the base URL. An alert appears if nothing is found.
3. Option B — Manual: Enter the server's address in the Base URL field directly, e.g. http://192.168.1.100:11434. You can find the server's local IP with ip addr (Linux), ifconfig (macOS), or ipconfig (Windows).
4. Once the base URL is set, Maid fetches the list of available models from your Ollama server and populates the Model dropdown automatically. Select the model you want to use.
5. Return to the chat screen. Your messages are sent to Ollama over your local network and responses are streamed back in real time.
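If the connection fails, you can probe the same endpoints Maid uses with curl from any device on the network. The IP below is an example; substitute your server's address:

```shell
# Reachability check for an Ollama server on the LAN.
# The address is an example; substitute your server's IP.
OLLAMA_URL="http://192.168.1.100:11434"
# A bare GET returns the text "Ollama is running":
#   curl "$OLLAMA_URL"
# /api/tags lists the models the server has pulled, as JSON:
#   curl "$OLLAMA_URL/api/tags"
echo "$OLLAMA_URL"
```

If curl times out here, the problem is network or firewall configuration rather than Maid.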

Configuration reference

Field            Required  Notes
Base URL         Yes       e.g. http://192.168.1.100:11434
Model            Yes       Auto-populated from your Ollama instance
Custom Headers   No        Useful for proxies requiring authentication
Parameters       No        temperature, top_p, top_k, max_tokens, etc.
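For reference, the Parameters field corresponds to the "options" object in Ollama's API requests. A sketch, with illustrative values (note that Ollama's name for the max-tokens limit is num_predict):

```shell
# Example /api/chat request body with sampling options.
# Values are illustrative, not recommendations.
BODY='{"model": "llama3.2",
  "messages": [{"role": "user", "content": "Hello"}],
  "options": {"temperature": 0.7, "top_p": 0.9, "top_k": 40, "num_predict": 256}}'
# With the server reachable, this returns the model reply as JSON:
#   curl http://192.168.1.100:11434/api/chat -d "$BODY"
echo "$BODY"
```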

Troubleshooting

Find Ollama finds nothing

  • Confirm Ollama is running with OLLAMA_HOST=0.0.0.0.
  • Check that the firewall on the server allows port 11434.
  • Make sure both devices are on the same Wi-Fi network (not guest vs. main).

Model dropdown is empty after setting the Base URL

  • Run ollama list on the server to confirm you have models pulled.
  • Pull at least one model with ollama pull <model-name>.

Slow responses

  • The bottleneck is almost always the server, not the network. Try a smaller model or one that fits fully in your GPU VRAM.
  • Check server CPU/GPU utilization with htop or your GPU monitoring tool.