# Providers
Six AI providers, local and cloud — what each one is good for and what each one costs you in privacy and money.
Companion supports six AI providers. They split cleanly into local (run on your machine, your data stays put) and cloud (your prompts travel to a third-party provider, you get better models in exchange).
| Provider | Type | API key needed | Default model | Notes |
|---|---|---|---|---|
| Ollama | Local | No | qwen2.5:7b | The default. Fully local, no telemetry, free. |
| LM Studio | Local | No | (whatever you load) | GUI-based local runner. Good for picking models visually. |
| OpenAI | Cloud | Yes | gpt-4o-mini | GPT-4o, o1, o3, etc. Sends prompts to OpenAI. |
| OpenRouter | Cloud | Yes | openai/gpt-4o-mini | Aggregator with hundreds of models from many providers. |
| Anthropic | Cloud | Yes | claude-haiku-4-5 | Claude family. Sends prompts to Anthropic. |
| Custom | Either | Usually yes | (you set) | Any OpenAI-compatible endpoint — vLLM, LiteLLM, Together.ai, your own deployment. |
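If it helps to see the table as data: a sketch of the six presets as (base URL, key required, default model) triples. The cloud URLs are the providers' public API endpoints, and the structure is illustrative, not Companion's internal config.

```python
# Illustrative only: the table above, expressed as data. Cloud URLs are the
# providers' public endpoints; this is not Companion's internal config format.
PROVIDERS = {
    "ollama":     ("http://localhost:11434",       False, "qwen2.5:7b"),
    "lmstudio":   ("http://localhost:1234/v1",     False, None),  # model: whatever you load
    "openai":     ("https://api.openai.com/v1",    True,  "gpt-4o-mini"),
    "openrouter": ("https://openrouter.ai/api/v1", True,  "openai/gpt-4o-mini"),
    "anthropic":  ("https://api.anthropic.com",    True,  "claude-haiku-4-5"),
    "custom":     (None,                           None,  None),  # you supply all three
}
```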
## Local providers
### Ollama
The default and recommended local runner. Free, open-source, cross-platform. Runs on http://localhost:11434 after install.
Best for:
- Privacy-sensitive work where your prompts and tool results must stay local.
- Offline operation (no internet after the model download).
- Free unlimited usage.
Trade-off:
- Quality is bounded by what your hardware can run.
See setup for the full install path.
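Ollama also speaks the OpenAI wire format under /v1, so any OpenAI SDK can drive it. A minimal sketch with the `openai` Python package (the key is a placeholder; Ollama ignores it, but the SDK requires a non-empty string):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API under /v1. No real key is needed;
# the SDK just requires a non-empty string.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="qwen2.5:7b",  # the default model from the table above
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(resp.choices[0].message.content)
```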
### LM Studio
GUI-based local runner. You browse models in a built-in catalog, download them with a click, and run them through an OpenAI-compatible API.
Best for:
- Investigators who prefer a graphical model picker over command-line tools.
- Quickly trying many models from the LM Studio catalog.
Trade-off:
- Heavier than Ollama and slightly less efficient on most hardware.
Default Base URL: http://localhost:1234/v1.
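Because the default model is whatever you load, a quick way to check what the server is actually serving is the models endpoint of the same OpenAI-compatible API. A sketch, assuming the default Base URL:

```python
import requests

# List whatever models the LM Studio server currently exposes.
resp = requests.get("http://localhost:1234/v1/models", timeout=5)
for model in resp.json()["data"]:
    print(model["id"])
```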
## Cloud providers
### OpenAI
Direct connection to OpenAI's API. Models include gpt-4o, gpt-4o-mini, o1, o3-mini, etc.
Best for:
- Top-tier general-purpose performance.
- Tool-calling reliability — OpenAI's tool-calling implementation is the most battle-tested.
- Fast response times globally.
Trade-off:
- Your prompts and tool results travel to OpenAI's servers.
- Pay per token; check pricing on openai.com/api/pricing.
API key from platform.openai.com/api-keys.
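Since tool-calling reliability is the selling point here, a sketch of what a tool definition looks like on the Chat Completions API. The `search_airleak` tool is hypothetical, for illustration only; Companion registers its own catalog:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical tool for illustration; Companion supplies its own definitions.
tools = [{
    "type": "function",
    "function": {
        "name": "search_airleak",
        "description": "Search indexed AirLeak records for a query string.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Find mentions of ACME Corp."}],
    tools=tools,
)
# The model either answers directly or returns a structured tool call
# (tool_calls is None when it chose to answer in plain text).
print(resp.choices[0].message.tool_calls)
```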
### OpenRouter
Aggregator that exposes hundreds of models from many providers behind one OpenAI-compatible API. Includes Claude, GPT, Gemini, Llama, Qwen, DeepSeek, Mistral, and many others.
Best for:
- Trying a wide variety of models without separate accounts.
- Picking the cheapest provider for a given model on the fly.
- Access to models that aren't directly available in your region.
Trade-off:
- Your prompts pass through OpenRouter and the upstream provider.
- Pricing varies per model — Companion shows estimated cost per request.
API key from openrouter.ai/keys.
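Model IDs on OpenRouter are namespaced by upstream vendor, so switching vendors is a one-string change. A sketch (the IDs were valid at the time of writing; check openrouter.ai/models for current ones):

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# One client, many vendors: the model string picks the upstream provider.
for model in [
    "openai/gpt-4o-mini",
    "anthropic/claude-3.5-sonnet",
    "meta-llama/llama-3.1-8b-instruct",
]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "One-line summary of zero trust."}],
    )
    print(model, "->", resp.choices[0].message.content)
```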
### Anthropic
Direct connection to Anthropic's API. Models include the Claude family — claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5, etc.
Best for:
- Strong reasoning on complex multi-step queries.
- Long-context investigations (Claude's context windows are large).
- Tool-calling with high reliability.
Trade-off:
- Your prompts and tool results travel to Anthropic's servers.
- Pay per token; check pricing on anthropic.com/pricing.
API key from console.anthropic.com/settings/keys.
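Anthropic's native API uses its own Messages format rather than the OpenAI shape, and `max_tokens` is required. A minimal sketch with the official `anthropic` Python SDK and the default model from the table:

```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# max_tokens is required on the Messages API, unlike OpenAI's default behavior.
msg = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=512,
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(msg.content[0].text)  # content is a list of blocks; the first is text here
```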
### Custom (OpenAI-compatible)
Any endpoint that speaks the OpenAI API. Companion sends requests to whatever URL you configure with whatever API key (if any) you provide.
Best for:
- Self-hosted deployments — vLLM, llama.cpp server, Text Generation Inference, your own infrastructure.
- LiteLLM proxy in front of multiple providers.
- Together.ai, Groq, Fireworks, or other non-listed cloud providers with OpenAI-compatible APIs.
- Air-gapped enterprise deployments where you run a private LLM gateway.
Trade-off:
- You are responsible for the endpoint's availability, security, and cost.
- Privacy depends on where the endpoint runs: a self-hosted server keeps data on your infrastructure; a third-party cloud service does not.
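Before pointing Companion at a custom endpoint, a quick smoke test confirms it really speaks the OpenAI shape. A sketch against a vLLM server on its default port; adjust the URL and key to your deployment (many self-hosted servers accept any non-empty key):

```python
from openai import OpenAI

BASE_URL = "http://localhost:8000/v1"  # e.g. vLLM's default; adjust to your deployment
client = OpenAI(base_url=BASE_URL, api_key="unused")  # many self-hosted servers ignore the key

# 1. The endpoint should list at least one model.
models = [m.id for m in client.models.list().data]
print("served models:", models)

# 2. A tiny chat roundtrip should succeed.
resp = client.chat.completions.create(
    model=models[0],
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print("chat ok:", resp.choices[0].message.content)
```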
## Choosing between local and cloud
| You want... | Pick |
|---|---|
| Maximum privacy, free, offline-capable | Local (Ollama or LM Studio) |
| Top-tier model quality, fast | Cloud (OpenAI or Anthropic) |
| Many models cheaply for experimentation | OpenRouter |
| Self-hosted enterprise gateway | Custom |
| Hybrid — local for sensitive work, cloud for everything else | Switch as needed; Companion makes it one click |
## Switching providers
Settings → AI → Provider. Pick from the dropdown. Companion remembers each provider's last-used Base URL, model, and API key, so switching back is instant.
The conversation in the assistant drawer continues — the next message routes to the new provider. Earlier messages stay in the conversation history (the new provider sees them as plain context).
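At the API level, that continuity is possible because the history is just a list of role-tagged messages that can be replayed to any endpoint. A sketch of the idea (not Companion's internals):

```python
from openai import OpenAI

history = [
    {"role": "user", "content": "What does port 11434 usually indicate?"},
    {"role": "assistant", "content": "It is Ollama's default API port."},
    {"role": "user", "content": "And port 1234?"},
]

# Same message list, different endpoint: the new provider sees the earlier
# turns as plain context, exactly as described above.
ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
cloud = OpenAI()  # reads OPENAI_API_KEY from the environment

for client, model in [(ollama, "qwen2.5:7b"), (cloud, "gpt-4o-mini")]:
    resp = client.chat.completions.create(model=model, messages=history)
    print(model, "->", resp.choices[0].message.content)
```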
For "privacy when it matters, capability when it doesn't," set up Ollama as your default and Anthropic or OpenAI as a secondary. Use Ollama for anything touching AirLeak data; switch to the cloud provider for help writing scripts, debugging code, or general questions where the data is non-sensitive.
## Cost tracking
For cloud providers, Companion shows the estimated cost per request based on the model's published per-1M-token pricing:
- Per-request — the chat drawer shows estimated cost on each cloud response.
- Aggregated — Settings shows a running total per provider for the current session.
Cost is estimated, not authoritative. The actual bill from your provider is the source of truth. Companion's estimate uses the published rate at the time of the request and the token counts the provider returned.
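The arithmetic is straightforward: token counts from the provider's usage data times the per-1M rates. A sketch with illustrative numbers (the rates below were gpt-4o-mini's published rates at the time of writing; always check the pricing page):

```python
# (input USD per 1M tokens, output USD per 1M tokens). Illustrative rates;
# check the provider's pricing page for current numbers.
RATES = {"gpt-4o-mini": (0.15, 0.60)}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost from the token counts the provider returned."""
    in_rate, out_rate = RATES[model]
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000

# e.g. a request that used 12,000 prompt tokens and 800 completion tokens:
print(f"${estimate_cost('gpt-4o-mini', 12_000, 800):.6f}")  # -> $0.002280
```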
Local providers always show zero cost.
## Model capabilities per provider
Different models — even from the same provider — support different capabilities:
| Capability | Why it matters |
|---|---|
| Tool calling | Required for the assistant to drive Companion or call MCP tools |
| Vision | Image inputs (not currently used by Companion's flows; reserved for future use) |
| Streaming | Token-by-token response rendering |
Companion shows the active model's capabilities under Settings → AI. Models without tool calling still work for conversational Q&A but can't operate Companion's tools.
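In practice that gating looks something like the sketch below; the function and capability names are hypothetical, not Companion's internals:

```python
# Hypothetical capability gating; names are illustrative, not Companion internals.
def build_request(caps: dict, messages: list, tools: list) -> dict:
    req = {"messages": messages, "stream": caps.get("streaming", False)}
    if caps.get("tool_calling"):
        req["tools"] = tools  # only tool-capable models can drive Companion/MCP tools
    return req

# A model without tool calling still handles plain conversational Q&A:
print(build_request({"streaming": True}, [{"role": "user", "content": "hi"}], tools=[]))
```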
## What does not change between providers
- Companion's tool catalog — same tools available regardless of provider.
- MCP servers — connected MCP servers' tools work across all providers.
- The chat drawer — same interface whether you're on Ollama or Anthropic.
- Confirmation prompts for state-changing tools — always required, regardless of provider.
The provider is a swap of the brain. The body — Companion's tools, MCP integrations, UI conventions — stays the same.