The life agentic

Local models and Ollama

If you want to run models on your own machine instead of depending on GitHub, Claude, or a direct API provider, Ollama is the main route worth learning early.

This is not the simplest default path for most students. It becomes attractive when local or offline use, privacy, open-weight experimentation, or avoiding per-prompt API billing matters more than using the strongest hosted model. For the broader picture of why privacy and data exposure matter at all, see Problematic cases of using AI.

Intended learning outcomes covered on this page

After working through this page, students should be better able to:

choose an appropriate access route based on availability, cost, and desired workflow
explain when local models via Ollama are a good fit and when hosted tools are the better choice
critically verify model output against files, tools, and project context before relying on it
choose an appropriate model family for a task by weighing quality, speed, cost, and local-versus-hosted constraints

When this route makes sense

you want local or offline use
you want more control over where data goes
you want to experiment with open-weight models
you do not want every prompt to depend on a remote subscription or API
you are willing to trade convenience for more control

When not to start here

you want the quickest path to a strong coding agent
you mainly need reliable repository help, not model experimentation
your machine already struggles with development tools
you do not want to think about local servers, model downloads, or hardware limits

If that sounds like you, start with Model access, GitHub Copilot CLI, or Claude Code instead.

Realistic expectations

Local does not automatically mean better. The tradeoffs are simply different:

smaller models are easier to run than larger coding models
speed and quality depend heavily on your machine
local models can be useful for drafting, summarising, classification, and simple code help
for harder repository work or ambiguous debugging, hosted tools are often still easier and stronger

Ollama also offers cloud options, but this page focuses on the local-first workflow.

A basic Ollama workflow

Install Ollama from the official download page.
Pull a model.
Make sure it runs locally.

For example:

ollama pull gemma3
ollama list
ollama run gemma3

If one model feels too slow, switch to a smaller one before you keep tuning settings.

On most systems, Ollama exposes a local server that tools can connect to.

Use it with `llm`

The simplest local path for llm is the llm-ollama plugin:

llm install llm-ollama
llm ollama models
llm -m gemma3 "Explain what this repository is about"

By default, llm-ollama talks to a local Ollama server at localhost:11434. If your server is elsewhere, set OLLAMA_HOST.

Use it with OpenCode

The local OpenCode route is:

Make sure Ollama is running and has at least one model pulled.
Start opencode.
Run /connect.
Choose Ollama.
Confirm the local server address if OpenCode asks for it.
Run /models and select a model.

If tool-calling does not work well, OpenCode recommends increasing num_ctx, starting around 16k to 32k.

Hardware and workflow caveats

disk space, RAM, and GPU availability matter
a model that technically runs may still be too slow to be pleasant
larger coding models are often the hardest to run well on student hardware
start with one small model and one simple prompt before building a bigger workflow on top of it

This is a good route for learning and experimentation, but it is not a promise that your laptop will handle every model comfortably.

Where this fits in this course

use GitHub Copilot CLI or Claude Code if you want the easiest strong hosted coding workflow
use OpenCode when you want an open source coding agent that can connect to Ollama
use the Python package llm when you want local models inside shell pipelines, chats, or Python code

Good habits

verify ollama run ... works before adding another tool on top
keep one or two models installed at first
use ollama list, llm ollama models, or OpenCode /models to check what is actually available
remember that local models still need verification; local does not mean correct

Short version

Install Ollama.
Pull one model and make sure ollama run ... works.
Connect it to llm or OpenCode.
Use local models when privacy, offline use, or open-weight experimentation matters.
Prefer hosted tools when you want the easiest or strongest repository workflow.

Next step

Official links

This site is open source. Improve this page.