Honest comparison
The coding agent that runs on your machine — or any API you choose. Not locked to theirs.
Every cloud coding agent does the same thing: picks a provider for you, locks you in, charges per token, and ships your code to a server you don't control. miii runs on Ollama by default — local, free, private. Same agent. Same Beacon context engine. Your provider, your choice, your cost. No subscription on top.
Feature matrix
| miii | Claude Code | Cursor | Codex CLI | |
|---|---|---|---|---|
| Cost | $0 local · API at cost | $20–400+/mo | $20–40/mo | Pay-per-token |
| Your code stays local | ✓ | ✗ | ✗ | ✗ |
| Works offline | ✓ | ✗ | ✗ | ✗ |
| Open source | ✓ MIT | ✗ | ✗ | ✗ |
| Autonomous agent | ✓ | ✓ | partial | ✓ |
| IDE required | ✗ | ✗ | VS Code fork | ✗ |
| Goal-aware context (Beacon)★ | ✓ | ✗ | ✗ | ✗ |
| Per-tool context compression★ | ✓ | ✗ | ✗ | ✗ |
| Dynamic context window | ✓ auto | hardcoded | ✗ | ✗ |
| OS-level shell sandbox | ✓ | ✗ | ✗ | ✗ |
| Shadow git (model edit log) | ✓ | ✗ | ✗ | ✗ |
| Vendor lock-in | None | Anthropic | Cursor + OpenAI | OpenAI |
The real cost of cloud agents
Without context management, context grows every iteration. By depth 10, each LLM call carries the full history of every file read, every command run, every test output — verbatim.
Simple task
bug fix, 3–5 tool calls
Complex task
refactor, 10–15 tool calls, multi-file
Annual cost at real usage
20 tasks/day · 220 working days · 50% complex. That's 4,400 tasks/year.
bars scaled to Claude Code annual cost as reference
| miii (Ollama) | $0 |
| miii (Anthropic API) | ~$833–1,493/yr |
| miii (OpenAI API) | ~$400–640/yr |
| Claude Code | $1,100–1,760/yr |
| Codex CLI | $400–640/yr |
| Cursor Pro | $240/yr |
| GitHub Copilot | $120/yr |
Over 3 years
miii with Anthropic API costs the same tokens as Claude Code — but Beacon's 60–70% context compression means fewer tokens per task. No subscription fee. No markup.
Beacon — why miii wins on long tasks
Context window at each depth
Beacon extracts your goal at depth 0, then injects a live state block just before the last message at every subsequent depth. No LLM call. Extracted in a single split. Injected every time.
How Beacon compresses each tool
| Tool result | Reduction |
|---|---|
| read_file (200 lines) | 97% |
| list_files (50 entries) | 84% |
| run_command (100 lines) | 95% |
| run_tests (full output) | 90% |
| Error messages | — |
In a 15-step task
API cost savings (Sonnet 4.5 · $3/MTok)
For Ollama users
Beacon is the difference between a task completing and the context window crashing. An 8K-context model hits the wall at depth 6–8 on a complex task. With Beacon, the same model runs to depth 20.
Privacy
Your .env files, proprietary algorithms, unreleased features, client codebases → Anthropic's servers.
Your .env files, proprietary algorithms, unreleased features, client codebases → Cursor Inc + OpenAI or Anthropic.
Your .env files, proprietary algorithms, unreleased features, client codebases → OpenAI's servers.
Runs on Ollama. Your code never touches a network. When you opt into a cloud provider, you're making a conscious, per-session decision. You decide what leaves and when.
Claude Code, Cursor, and Codex have no local fallback. Every task, every prompt, every file read goes to the cloud. For fintech, healthcare, legal, and defence: this is the difference between compliant and non-compliant.
Who miii is for
Want a capable coding agent at $0 (local) or raw API cost (cloud). No subscription on top of your API key.
Working on client code, IP, or credentials in-tree. Code must not leave the machine.
Travel, isolated networks, regulated environments, zero-internet setups. miii works where cloud AI can't.
Keep hitting context limits on complex autonomous tasks. Beacon compresses 60–70% of context — small models run to depth 20.
Use Claude or OpenAI models without a subscription. Switch between Llama, Qwen, DeepSeek, and hosted models mid-session.
MIT licensed. No vendor dependency. Own your tools. Audit every line.
The honest trade-off
Cloud models (Claude Sonnet 4.5, o3) have higher raw accuracy than most local Ollama models today. For a one-off question where cost/privacy don't matter, Claude Code or Codex CLI will produce a better answer.
For everyday development — refactoring, debugging, writing tests, navigating codebases — qwen2.5-coder, deepseek-coder-v2, and llama3.1 are more than sufficient. Beacon keeps them on task through the whole job.
When you hit a hard problem, /cloud in miii escalates one prompt to Claude Opus 4 or o3. You decide what leaves your machine and when.