All configuration lives in .env at your workspace root. The setup wizard (octo init) generates this file, or you can create it manually from .env.example.

Config File Locations

Octo resolves its workspace by walking up from the current directory, looking for .octo/ or .env. If neither is found, it falls back to platform defaults.

Resolution Order

  1. Walk up from cwd — first directory containing .octo/ or .env
  2. Platform default (if no project workspace found):

| Platform | Default Location | State Dir |
| --- | --- | --- |
| macOS / Linux | ~ | ~/.octo/ |
| Windows | %LOCALAPPDATA%/octo | %LOCALAPPDATA%/octo/.octo/ |

  3. Override with OCTO_HOME env var — points to the workspace root
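For example, pinning the workspace with OCTO_HOME so Octo resolves the same root from any directory (the path below is a hypothetical example, not a default):

```shell
# Pin Octo to a fixed workspace regardless of cwd (path is hypothetical)
export OCTO_HOME="$HOME/octo-workspace"
mkdir -p "$OCTO_HOME/.octo"
echo "workspace: $OCTO_HOME"
```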

Config File Precedence

Octo checks two locations for .env and .mcp.json, loading the first it finds:
| File | Primary (checked first) | Fallback |
| --- | --- | --- |
| .env | .octo/.env | <workspace>/.env |
| .mcp.json | .octo/.mcp.json | <workspace>/.mcp.json |
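The "first match wins" rule can be sketched with a throwaway workspace; the loop below mirrors the documented precedence (paths and values are demo placeholders, not Octo code):

```shell
# Demo workspace with an .env in both locations
ws=/tmp/octo-precedence-demo
mkdir -p "$ws/.octo"
echo "A=from-octo-dir" > "$ws/.octo/.env"
echo "A=from-workspace-root" > "$ws/.env"

# .octo/.env is checked first, so it is the one loaded
for f in "$ws/.octo/.env" "$ws/.env"; do
  if [ -f "$f" ]; then cat "$f"; break; fi
done
```

Here the loop prints the .octo/.env contents and never reaches the workspace-root fallback.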
For global installs (via uvx or uv tool), keep all config inside ~/.octo/ — create ~/.octo/.env and ~/.octo/.mcp.json. This way Octo works from any directory without a project-local workspace.
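A one-time setup for a global install might look like the following; the file contents are placeholders (see .env.example and .mcp.json.example for the real templates):

```shell
# Create the global workspace used when no project-local one exists
mkdir -p ~/.octo

# Placeholder contents -- substitute your real key and server entries
printf 'ANTHROPIC_API_KEY=sk-ant-...\n' > ~/.octo/.env
printf '{}\n' > ~/.octo/.mcp.json
```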

LLM Providers

Octo supports 7 providers. You only need to configure one — or mix multiple providers across tiers.
Anthropic is the simplest option: direct API access to Claude models.
ANTHROPIC_API_KEY=sk-ant-...
DEFAULT_MODEL=claude-sonnet-4-5-20250929

Auto-Detection

The model factory auto-detects the provider from the model name. Use the universal provider/ prefix for explicit routing, or rely on legacy heuristics for unprefixed names:
| Model name pattern | Provider |
| --- | --- |
| anthropic/claude-* | Anthropic (prefix) |
| bedrock/eu.anthropic.* | AWS Bedrock (prefix) |
| openai/gpt-* | OpenAI (prefix) |
| gemini/gemini-* | Google Gemini (prefix) |
| local/llama3 | Local / Custom (prefix) |
| github/openai/gpt-4.1 | GitHub Models (prefix) |
| claude-* | Anthropic (heuristic) |
| gemini-* | Google Gemini (heuristic) |
| eu.anthropic.*, us.anthropic.* | AWS Bedrock (heuristic) |
| gpt-*, o1-*, o3-*, o4-* | OpenAI (heuristic) |
| gpt-* + AZURE_OPENAI_ENDPOINT set | Azure OpenAI (heuristic) |
Override with LLM_PROVIDER if needed:
LLM_PROVIDER=bedrock  # anthropic | bedrock | openai | azure | github | gemini | local

Mixed Providers

You can use different providers for each tier — just use provider/ prefixes in your model names and leave LLM_PROVIDER unset:
# Each tier auto-detects its provider from the prefix
HIGH_TIER_MODEL=anthropic/claude-sonnet-4-5-20250929
DEFAULT_MODEL=gemini/gemini-2.5-flash
LOW_TIER_MODEL=local/llama3

# Provide credentials for each provider used
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AI...
OPENAI_API_BASE=http://localhost:8000/v1
Mixed providers are great for cost optimization: use a powerful cloud model for complex reasoning, a fast Gemini model for general routing, and a free local model for summarization.

Model Tiers

Octo uses three tiers to balance cost vs quality. Different agents use different tiers:
| Tier | Used For | Example |
| --- | --- | --- |
| HIGH | Complex reasoning, architecture, multi-step planning | claude-opus-4-5-20250929 |
| DEFAULT | Supervisor routing, general chat, tool use | claude-sonnet-4-5-20250929 |
| LOW | Summarization, simple workers, cost-sensitive tasks | claude-haiku-4-5-20251001 |
DEFAULT_MODEL=claude-sonnet-4-5-20250929
HIGH_TIER_MODEL=claude-opus-4-5-20250929
LOW_TIER_MODEL=claude-haiku-4-5-20251001

Model Profiles

Profiles are presets that map tiers to agent roles:
| Profile | Supervisor | Workers | High-tier agents |
| --- | --- | --- | --- |
| quality | high | default | high |
| balanced | default | low | high |
| budget | low | low | default |
MODEL_PROFILE=balanced
Switch at runtime with /profile <name>.

Agent Directories

Load agents from external projects by pointing to their AGENT.md directories:
AGENT_DIRS=/path/to/project-a/.claude/agents:/path/to/project-b/.claude/agents
Entries are colon-separated; each directory is scanned for */AGENT.md files.
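To sanity-check what Octo will pick up, you can expand AGENT_DIRS the same way by hand. The demo directories below are created just for the example; the split-and-scan loop is a sketch, not Octo's actual code:

```shell
# Build two demo agent directories with AGENT.md files
mkdir -p /tmp/octo-agents/a/reviewer /tmp/octo-agents/b/planner
touch /tmp/octo-agents/a/reviewer/AGENT.md /tmp/octo-agents/b/planner/AGENT.md

# Split on ':' and list each */AGENT.md, mirroring the documented scan
AGENT_DIRS=/tmp/octo-agents/a:/tmp/octo-agents/b
echo "$AGENT_DIRS" | tr ':' '\n' | while read -r dir; do
  ls "$dir"/*/AGENT.md
done
```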

Middleware Tuning

# Max chars for a single tool result before truncation
TOOL_RESULT_LIMIT=20000

# Context window summarization triggers (whichever fires first)
SUMMARIZATION_TRIGGER_FRACTION=0.7
SUMMARIZATION_TRIGGER_TOKENS=40000

# Tokens of recent history to keep after summarization
SUMMARIZATION_KEEP_TOKENS=8000

# Supervisor per-message char limit
SUPERVISOR_MSG_CHAR_LIMIT=30000

# Timeout for claude -p subprocess calls (seconds)
CLAUDE_CODE_TIMEOUT=2400
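The "whichever fires first" rule for the two summarization triggers can be sketched as follows; the 100k-token context window and the 45k usage figure are assumed demo values, not Octo defaults:

```shell
CONTEXT_WINDOW=100000      # assumed model context window for the demo
TRIGGER_FRACTION_PCT=70    # SUMMARIZATION_TRIGGER_FRACTION=0.7, as a percent for integer math
TRIGGER_TOKENS=40000       # SUMMARIZATION_TRIGGER_TOKENS
used=45000                 # tokens currently in context (demo value)

# Summarization fires when EITHER threshold is crossed
if [ "$used" -ge "$TRIGGER_TOKENS" ] || \
   [ $((used * 100)) -ge $((CONTEXT_WINDOW * TRIGGER_FRACTION_PCT)) ]; then
  echo "summarize"
else
  echo "ok"
fi
```

With used=45000, the absolute token trigger (40000) fires even though 45k is below 70% of the 100k window, so this prints summarize.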

Observability (Langfuse)

Octo supports Langfuse tracing for monitoring AI decisions, token usage, and cost. Toggle with a single env var:
LANGFUSE_ENABLED=true
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com  # optional, for self-hosted
See Langfuse Integration for the full setup guide.

Voice

Octo supports pluggable STT and TTS engines — cloud (ElevenLabs) or local (any command-line tool).
# Engine selection (default: elevenlabs for both)
VOICE_STT_ENGINE=whisper      # elevenlabs | whisper
VOICE_TTS_ENGINE=kokoro       # elevenlabs | kokoro

# ElevenLabs (cloud)
ELEVENLABS_API_KEY=your_key
ELEVENLABS_VOICE_ID=...       # optional

# Local engines — full command line (Octo appends file path as last arg)
WHISPER_COMMAND=/path/to/venv/bin/python /path/to/transcribe.py
KOKORO_COMMAND=/path/to/venv/bin/python /path/to/synthesize.py
See Voice for the subprocess protocol, example scripts, and voiceover text preparation.
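As a minimal illustration of the appended-argument contract only, a stand-in STT command might look like this. The script name and its output format are hypothetical; the real subprocess protocol is described in the Voice docs:

```shell
# Stand-in STT command: the audio path arrives as the final argument
cat > /tmp/fake_transcribe.sh <<'EOF'
#!/bin/sh
audio_file="$1"
echo "transcript for: $audio_file"
EOF
chmod +x /tmp/fake_transcribe.sh

# Equivalent of WHISPER_COMMAND=/tmp/fake_transcribe.sh, with Octo appending the path
/tmp/fake_transcribe.sh /tmp/recording.wav
```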

MCP Servers

MCP server configuration lives in .mcp.json (checked in .octo/.mcp.json first, then workspace root). See .mcp.json.example for a template, and MCP Servers for management commands. Octo includes a built-in MS Teams server — add it to .mcp.json for chat integration.

Next Steps