All configuration lives in .env at your workspace root. The setup wizard (octo init) generates this file, or you can create it manually from .env.example.

Config File Locations

Octo resolves its workspace by walking up from the current directory, looking for .octo/ or .env. If neither is found, it falls back to platform defaults.

Resolution Order

  1. Walk up from cwd — first directory containing .octo/ or .env
  2. Platform default (if no project workspace found):

| Platform | Default Location | State Dir |
| --- | --- | --- |
| macOS / Linux | ~ | ~/.octo/ |
| Windows | %LOCALAPPDATA%/octo | %LOCALAPPDATA%/octo/.octo/ |

  3. Override with OCTO_HOME env var — points to the workspace root
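For example, pinning the workspace with OCTO_HOME so Octo resolves the same root from any directory (the path below is a hypothetical example, not a default):

```shell
# Pin Octo to a fixed workspace regardless of cwd (path is hypothetical)
export OCTO_HOME="$HOME/octo-workspace"
mkdir -p "$OCTO_HOME/.octo"
echo "workspace: $OCTO_HOME"
```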

Config File Precedence

Octo checks two locations for .env and .mcp.json, loading the first it finds:
| File | Primary (checked first) | Fallback |
| --- | --- | --- |
| .env | .octo/.env | <workspace>/.env |
| .mcp.json | .octo/.mcp.json | <workspace>/.mcp.json |
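The "first match wins" rule can be sketched with a throwaway workspace; the loop below mirrors the documented precedence (paths and values are demo placeholders, not Octo code):

```shell
# Demo workspace with an .env in both locations
ws=/tmp/octo-precedence-demo
mkdir -p "$ws/.octo"
echo "A=from-octo-dir" > "$ws/.octo/.env"
echo "A=from-workspace-root" > "$ws/.env"

# .octo/.env is checked first, so it is the one loaded
for f in "$ws/.octo/.env" "$ws/.env"; do
  if [ -f "$f" ]; then cat "$f"; break; fi
done
```

Here the loop prints the .octo/.env contents and never reaches the workspace-root fallback.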
For global installs (via uvx or uv tool), keep all config inside ~/.octo/ — create ~/.octo/.env and ~/.octo/.mcp.json. This way Octo works from any directory without a project-local workspace.
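A one-time setup for a global install might look like the following; the file contents are placeholders (see .env.example and .mcp.json.example for the real templates):

```shell
# Create the global workspace used when no project-local one exists
mkdir -p ~/.octo

# Placeholder contents -- substitute your real key and server entries
printf 'ANTHROPIC_API_KEY=sk-ant-...\n' > ~/.octo/.env
printf '{}\n' > ~/.octo/.mcp.json
```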

LLM Providers

Octo supports 7 providers. You only need to configure one — or mix multiple providers across tiers.
Anthropic is the simplest option: direct API access to Claude models.
ANTHROPIC_API_KEY=sk-ant-...
DEFAULT_MODEL=claude-sonnet-4-5-20250929

Auto-Detection

The model factory auto-detects the provider from the model name. Use the universal provider/ prefix for explicit routing, or rely on legacy heuristics for unprefixed names:
| Model name pattern | Provider |
| --- | --- |
| anthropic/claude-* | Anthropic (prefix) |
| bedrock/eu.anthropic.* | AWS Bedrock (prefix) |
| openai/gpt-* | OpenAI (prefix) |
| gemini/gemini-* | Google Gemini (prefix) |
| local/llama3 | Local / Custom (prefix) |
| github/openai/gpt-4.1 | GitHub Models (prefix) |
| claude-* | Anthropic (heuristic) |
| gemini-* | Google Gemini (heuristic) |
| eu.anthropic.*, us.anthropic.* | AWS Bedrock (heuristic) |
| gpt-*, o1-*, o3-*, o4-* | OpenAI (heuristic) |
| gpt-* + AZURE_OPENAI_ENDPOINT set | Azure OpenAI (heuristic) |
Override with LLM_PROVIDER if needed:
LLM_PROVIDER=bedrock  # anthropic | bedrock | openai | azure | github | gemini | local

Mixed Providers

You can use different providers for each tier — just use provider/ prefixes in your model names and leave LLM_PROVIDER unset:
# Each tier auto-detects its provider from the prefix
HIGH_TIER_MODEL=anthropic/claude-sonnet-4-5-20250929
DEFAULT_MODEL=gemini/gemini-2.5-flash
LOW_TIER_MODEL=local/llama3

# Provide credentials for each provider used
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AI...
OPENAI_API_BASE=http://localhost:8000/v1
Mixed providers are great for cost optimization: use a powerful cloud model for complex reasoning, a fast Gemini model for general routing, and a free local model for summarization.

Model Tiers

Octo uses three tiers to balance cost vs quality. Different agents use different tiers:
| Tier | Used For | Example |
| --- | --- | --- |
| HIGH | Complex reasoning, architecture, multi-step planning | claude-opus-4-5-20250929 |
| DEFAULT | Supervisor routing, general chat, tool use | claude-sonnet-4-5-20250929 |
| LOW | Summarization, simple workers, cost-sensitive tasks | claude-haiku-4-5-20251001 |
DEFAULT_MODEL=claude-sonnet-4-5-20250929
HIGH_TIER_MODEL=claude-opus-4-5-20250929
LOW_TIER_MODEL=claude-haiku-4-5-20251001

Model Profiles

Profiles are presets that map tiers to agent roles:
| Profile | Supervisor | Workers | High-tier agents |
| --- | --- | --- | --- |
| quality | high | default | high |
| balanced | default | low | high |
| budget | low | low | default |
MODEL_PROFILE=balanced
Switch at runtime with /profile <name>.

Agent Directories

Load agents from external projects by pointing to their AGENT.md directories:
AGENT_DIRS=/path/to/project-a/.claude/agents:/path/to/project-b/.claude/agents
Entries are colon-separated; each directory is scanned for */AGENT.md files.
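To sanity-check what Octo will pick up, you can expand AGENT_DIRS the same way by hand. The demo directories below are created just for the example; the split-and-scan loop is a sketch, not Octo's actual code:

```shell
# Build two demo agent directories with AGENT.md files
mkdir -p /tmp/octo-agents/a/reviewer /tmp/octo-agents/b/planner
touch /tmp/octo-agents/a/reviewer/AGENT.md /tmp/octo-agents/b/planner/AGENT.md

# Split on ':' and list each */AGENT.md, mirroring the documented scan
AGENT_DIRS=/tmp/octo-agents/a:/tmp/octo-agents/b
echo "$AGENT_DIRS" | tr ':' '\n' | while read -r dir; do
  ls "$dir"/*/AGENT.md
done
```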

Middleware Tuning

# Max chars for a single tool result before truncation
TOOL_RESULT_LIMIT=20000

# Context window summarization triggers (whichever fires first)
SUMMARIZATION_TRIGGER_FRACTION=0.7
SUMMARIZATION_TRIGGER_TOKENS=40000

# Tokens of recent history to keep after summarization
SUMMARIZATION_KEEP_TOKENS=8000

# Supervisor per-message char limit
SUPERVISOR_MSG_CHAR_LIMIT=30000

# Timeout for claude -p subprocess calls (seconds)
CLAUDE_CODE_TIMEOUT=2400
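The "whichever fires first" rule for the two summarization triggers can be sketched as follows; the 100k-token context window and the 45k usage figure are assumed demo values, not Octo defaults:

```shell
CONTEXT_WINDOW=100000      # assumed model context window for the demo
TRIGGER_FRACTION_PCT=70    # SUMMARIZATION_TRIGGER_FRACTION=0.7, as a percent for integer math
TRIGGER_TOKENS=40000       # SUMMARIZATION_TRIGGER_TOKENS
used=45000                 # tokens currently in context (demo value)

# Summarization fires when EITHER threshold is crossed
if [ "$used" -ge "$TRIGGER_TOKENS" ] || \
   [ $((used * 100)) -ge $((CONTEXT_WINDOW * TRIGGER_FRACTION_PCT)) ]; then
  echo "summarize"
else
  echo "ok"
fi
```

With used=45000, the absolute token trigger (40000) fires even though 45k is below 70% of the 100k window, so this prints summarize.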

Observability (Langfuse)

Octo supports Langfuse tracing for monitoring AI decisions, token usage, and cost. Toggle with a single env var:
LANGFUSE_ENABLED=true
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com  # optional, for self-hosted
See Langfuse Integration for the full setup guide.

Voice

Octo supports pluggable STT and TTS engines — cloud (ElevenLabs) or local (any command-line tool).
# Engine selection (default: elevenlabs for both)
VOICE_STT_ENGINE=whisper      # elevenlabs | whisper
VOICE_TTS_ENGINE=kokoro       # elevenlabs | kokoro

# ElevenLabs (cloud)
ELEVENLABS_API_KEY=your_key
ELEVENLABS_VOICE_ID=...       # optional

# Local engines — full command line (Octo appends file path as last arg)
WHISPER_COMMAND=/path/to/venv/bin/python /path/to/transcribe.py
KOKORO_COMMAND=/path/to/venv/bin/python /path/to/synthesize.py
See Voice for the subprocess protocol, example scripts, and voiceover text preparation.
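As a minimal illustration of the appended-argument contract only, a stand-in STT command might look like this. The script name and its output format are hypothetical; the real subprocess protocol is described in the Voice docs:

```shell
# Stand-in STT command: the audio path arrives as the final argument
cat > /tmp/fake_transcribe.sh <<'EOF'
#!/bin/sh
audio_file="$1"
echo "transcript for: $audio_file"
EOF
chmod +x /tmp/fake_transcribe.sh

# Equivalent of WHISPER_COMMAND=/tmp/fake_transcribe.sh, with Octo appending the path
/tmp/fake_transcribe.sh /tmp/recording.wav
```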

MCP Servers

MCP server configuration lives in .mcp.json (checked in .octo/.mcp.json first, then workspace root). See .mcp.json.example for a template, and MCP Servers for management commands. Octo includes a built-in MS Teams server — add it to .mcp.json for chat integration.

Next Steps