ZeroTrace Companion

AI Chat

The chat interface — drawer, history, streaming, cost tracking, tool-call rendering.

The AI chat is the main interaction surface. Open the drawer with Ctrl+Shift+A, type a message, get a response.

Layout

The drawer opens on the right side of the window:

  • Conversation — message history with user / assistant turns and tool-call cards.
  • Input — where you type the next message.
  • Toolbar — provider / model picker, clear chat, export, settings.
  • Status badge — green / yellow / red indicator for the active provider.
  • Cost tag — for cloud providers, shows the estimated cost-so-far for the conversation.

User messages right-align; assistant messages left-align. Tool calls render as collapsible cards inline with the conversation.

Sending a message

Type in the input field, press Ctrl+Enter (or click Send). The model receives:

  • Your message.
  • The conversation history (within the model's context window).
  • The system prompt (set in setup).
  • Optional context the assistant can see (active session, connected device, library snapshot).
  • The complete tool catalog (Companion's built-in tools plus any connected MCP server's tools).
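
The request assembly above can be sketched roughly as follows. This is an illustrative sketch, not Companion's actual code; the names (`build_request`, `SYSTEM_PROMPT`) and the message shape are assumptions.

```python
# Hypothetical sketch of how a chat request might be assembled per turn.
SYSTEM_PROMPT = "You are the ZeroTrace Companion assistant."  # illustrative

def build_request(history, user_message, context=None, tools=None):
    """Assemble the message list sent to the provider for one turn."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if context:  # optional runtime context: session, device, library snapshot
        messages.append({"role": "system", "content": f"Context: {context}"})
    messages.extend(history)  # prior turns, within the context window
    messages.append({"role": "user", "content": user_message})
    return {"messages": messages, "tools": tools or []}

req = build_request(
    history=[{"role": "user", "content": "hi"},
             {"role": "assistant", "content": "hello"}],
    user_message="What devices are nearby?",
    context={"device": "ZT-1", "mode": "scan"},
)
```

The key point: history, system prompt, context, and the tool catalog travel together on every turn; only the final user message is new.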

For providers that support streaming, the response renders token-by-token as it generates. For providers that don't (or for tool-call responses, where the full call must arrive first), the response appears all at once.

Streaming responses

Most modern providers stream. You'll see tokens appearing in real time as the model generates them. Benefits:

  • Faster perceived latency — you start reading immediately.
  • Early interrupt — if the response is going off-track, hit stop before the model finishes.
  • Token counter updates live — useful when you want to track context-window usage.

Streaming and tool calling interact: when the model calls a tool, the streaming pauses, the tool executes, the result returns, and the model continues. You see the tool-call card appear mid-response.
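
The pause-execute-resume loop can be sketched as a simple event consumer. Everything here (event shapes, `run_turn`, `call_tool`) is hypothetical, for illustration only.

```python
# Illustrative sketch of the stream/tool loop: render tokens until the model
# emits a tool call, run the tool, append its result, then keep streaming.
def run_turn(stream_events, call_tool):
    """Consume stream events; 'token' events render, 'tool_call' events pause."""
    transcript = []
    for event in stream_events:
        if event["type"] == "token":
            transcript.append(event["text"])  # rendered token-by-token
        elif event["type"] == "tool_call":
            # Streaming pauses here while the tool executes.
            result = call_tool(event["name"], event["args"])
            transcript.append(f"[tool {event['name']} -> {result}]")
    return "".join(transcript)

out = run_turn(
    [{"type": "token", "text": "Checking... "},
     {"type": "tool_call", "name": "session_stats", "args": {}},
     {"type": "token", "text": "Done."}],
    call_tool=lambda name, args: "42 devices",
)
```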

Cost tracking (cloud providers)

For cloud providers (OpenAI, Anthropic, OpenRouter, paid Custom), Companion tracks estimated cost per response based on:

  • The model's published per-1M input / output token pricing.
  • The provider-reported input and output token counts.

Display:

  • Per-response cost tag — small chip under each cloud response showing the estimated cost in USD.
  • Conversation total — the drawer header shows the running total for the current conversation.
  • Per-provider total — settings → AI shows the accumulated cost per provider for the current session.

Cost estimates are estimates. The actual bill from your provider is authoritative. Companion's estimate uses pricing from the provider's catalog at request time and the token counts the provider returned.
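
The estimate reduces to a simple formula over the provider-reported token counts. The prices in the example are made-up placeholders, not real provider rates.

```python
# Estimated USD cost for one response from provider-reported token counts.
def estimate_cost(input_tokens, output_tokens, price_in_per_1m, price_out_per_1m):
    """Published per-1M prices times actual token usage, input plus output."""
    return (input_tokens / 1_000_000) * price_in_per_1m \
         + (output_tokens / 1_000_000) * price_out_per_1m

# e.g. 12,000 input + 800 output tokens at $3 / $15 per 1M tokens:
cost = estimate_cost(12_000, 800, 3.0, 15.0)
# cost ≈ 0.048 USD
```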

Multi-turn conversation

The assistant remembers earlier turns in the same conversation. You can:

  • Reference earlier responses ("expand on what you said about RSSI patterns").
  • Build on findings ("now run that same query for the previous session").
  • Correct mistakes ("that's not what I meant — I want only Apple devices").

The conversation continues until you clear it (Ctrl+L) or close Companion. Conversations are not currently persisted across launches; export important findings before closing.

Switching providers mid-conversation

Use the provider picker in the toolbar. The next message routes to the new provider; earlier messages remain in context (the new provider sees them as plain text).

Useful for hybrid workflows:

  1. Start with Ollama, ask sensitive questions about AirLeak data.
  2. Switch to Anthropic for help writing a follow-up script.
  3. Switch back to Ollama to continue with sensitive analysis.

The chat history is shared; the destination of new messages changes.
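
One way to picture this: a single shared history with a routing pointer. The class below is a hypothetical sketch, not Companion's implementation.

```python
# Illustrative: the conversation keeps one shared history; switching providers
# only changes where the next request is routed.
class Conversation:
    def __init__(self, provider):
        self.provider = provider
        self.history = []

    def switch_provider(self, provider):
        self.provider = provider  # history is untouched

    def send(self, text, providers):
        self.history.append({"role": "user", "content": text})
        reply = providers[self.provider](self.history)  # route to active provider
        self.history.append({"role": "assistant", "content": reply})
        return reply

providers = {"ollama": lambda h: "local reply",
             "anthropic": lambda h: "cloud reply"}
c = Conversation("ollama")
c.send("sensitive question", providers)
c.switch_provider("anthropic")
c.send("write a script", providers)  # new provider sees the full earlier history
```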

What the assistant can see

The assistant has visibility into Companion's runtime state. By default:

  • Connected device info — model, firmware, mode.
  • Active session metadata — when an AirLeak session is running.
  • Library statistics — counts and aggregates, not individual device records.

When the assistant calls a tool, the tool's result is added to the conversation context for that turn. The assistant can then summarise or further query the result.

The assistant does not automatically see:

  • Your terminal command history.
  • Personal notes or labels.
  • Settings.

It accesses these only when you explicitly ask or when a tool call retrieves them.

Tool-call rendering

When the assistant calls a tool (Companion-built-in or MCP-provided), the call appears inline as a card:

  • Tool name and the arguments.
  • Source — which provider supplied the tool (Companion built-in / MCP server name).
  • Result (collapsible — full result available on click).
  • Time to execute.

You can see exactly what the assistant did. Nothing happens behind the scenes.
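
The data behind a card might look like the record below. Field names are illustrative assumptions, not Companion's actual schema.

```python
# Hypothetical shape of the data backing a tool-call card.
from dataclasses import dataclass

@dataclass
class ToolCallCard:
    name: str        # tool name as called by the model
    args: dict       # arguments the model supplied
    source: str      # "builtin" or the MCP server name
    result: str      # full result, collapsible in the UI
    elapsed_ms: int  # time to execute

card = ToolCallCard("session_stats", {}, "builtin", "42 devices", 18)
```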

Tone and voice

The default system prompt produces a tone that's:

  • Concise (matches the application's general voice).
  • Practical — focuses on what to do, not why something is interesting in general.
  • Calibrated about uncertainty — says when it doesn't know.

Override with a custom system prompt if you want a different style.

For experimentation, ask the assistant to "list every tool you can call." The model lists its full tool catalog with descriptions — useful for discovering what the assistant can actually do, especially when you have MCP servers connected.

Clearing the conversation

Ctrl+L clears the chat. The next message starts a fresh conversation with no memory of the previous one.

Useful for:

  • Starting a new investigation context.
  • Recovering when the model has gone off track.
  • Reducing context window usage on long-running conversations.
  • Resetting accumulated cost tracking for a new task.

Stop generation

If the model is producing a response that's clearly off-track, you can interrupt mid-stream. The drawer shows a "stop" button next to the streaming response. Click it to abort.

The aborted response stays in the conversation history (showing what was generated up to the abort) — useful for "I see where this is going wrong, let me redirect."

Limits and edge cases

  • Long contexts. Each model has a finite context window. Cloud models from Anthropic and OpenAI have large windows (200k+); local 7B models often have 8k-32k. When you fill the window, earlier messages drop from the model's memory. The chat shows a banner when this happens.
  • Tool failures. If a tool call fails, the assistant sees the failure and usually adapts (apologises, tries something else, asks for clarification). Persistent failures suggest a bug.
  • Hallucinations. Smaller / cheaper models hallucinate more than larger / more expensive ones. Verify any claim that matters, regardless of provider.
  • Provider-specific quirks. Each provider's tool calling has slight differences. Companion abstracts most of them; some edge cases may behave differently across providers.
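
The context-window behavior above amounts to a sliding-window trim. This sketch assumes a simple token budget; real implementations count tokens with the provider's tokenizer, and the function name is hypothetical.

```python
# Illustrative: drop the oldest messages until the history fits the window.
def trim_history(messages, count_tokens, budget):
    """Return (kept_messages, dropped_count); dropped > 0 would trigger the banner."""
    kept = list(messages)
    dropped = 0
    while kept and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(0)  # oldest turn falls out of the model's memory
        dropped += 1
    return kept, dropped

msgs = ["a" * 100, "b" * 100, "c" * 100]
kept, dropped = trim_history(msgs, count_tokens=len, budget=250)
# kept holds the two newest messages; one message was dropped
```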

Exporting a conversation

The drawer's toolbar has an export button. Choices:

  • Plain text — readable transcript.
  • Markdown — formatted for reports.
  • JSON — full structure including tool calls, results, and cost data.

Useful for documenting how an investigation reached a conclusion, sharing a useful conversation with a teammate, or building an audit trail for sensitive work.
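
For the JSON option, the exported structure plausibly looks like the sketch below. The field names are assumptions for illustration, not Companion's documented schema.

```python
# Hypothetical JSON export shape: messages with inline tool calls and cost data.
import json

conversation = {
    "messages": [
        {"role": "user", "content": "what's nearby?"},
        {"role": "assistant", "content": "Three devices.",
         "tool_calls": [{"name": "session_stats", "args": {}, "result": "3"}],
         "cost_usd": 0.002},
    ],
    "total_cost_usd": 0.002,
}
exported = json.dumps(conversation, indent=2)
```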

What chats are not for

  • Long-form documentation. The chat is conversational; documentation reads better in real docs (these pages, your investigation profile notes).
  • Deterministic operations. "Run this exact query" is better in the relevant view directly. The assistant is for discovery and explanation, not for clicking buttons faster.
  • Critical decisions without verification. The model can be confidently wrong — even the best ones. For decisions that matter, verify what it tells you against the actual data.
