Since v0.5.0, Octo ships a clean embeddable engine at octo.core. It has no environment variable reading, no CLI dependencies, and no side effects at import time — making it safe to embed in web services, workers, or test harnesses.
## Quick Start
```python
import asyncio

from octo.core import OctoEngine, OctoConfig
from octo.core.storage import FilesystemStorage

config = OctoConfig(
    llm_provider="anthropic",
    llm_credentials={"api_key": "sk-ant-..."},
    default_model="claude-sonnet-4-5-20250929",
    storage=FilesystemStorage(root="/path/to/.octo"),
)

engine = OctoEngine(config)

async def main():
    response = await engine.invoke("Hello!", thread_id="conv-123")
    print(response.content)
    await engine.close()

asyncio.run(main())
```
## OctoConfig

The configuration dataclass. The caller provides everything; no environment variables are read.

### Required Fields

| Field | Type | Description |
|---|---|---|
| `llm_provider` | `str` | `"anthropic"`, `"openai"`, `"bedrock"`, `"azure"`, `"github"`, `"gemini"`, or `"local"` |
| `llm_credentials` | `dict` | Provider-specific credentials (see below) |
### Credential Keys by Provider

| Provider | Required Keys | Optional Keys |
|---|---|---|
| Anthropic | `api_key` | — |
| OpenAI | `api_key` | — |
| Bedrock | `region`, `access_key_id`, `secret_access_key` | — |
| Azure | `api_key`, `endpoint` | `api_version` |
| GitHub | `api_key` | `base_url`, `anthropic_base_url` |
| Gemini | `api_key` | — |
| Local | — | `base_url`, `api_key` |
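If you build credential dicts dynamically, the table above can be encoded as a small pre-flight check. This is an illustrative standalone sketch, not part of the `octo.core` API; the `check_credentials` helper and `REQUIRED_KEYS` mapping are assumptions written out from the table:

```python
# Required credential keys per provider, mirroring the table above.
# Standalone sketch -- not part of octo.core.
REQUIRED_KEYS = {
    "anthropic": {"api_key"},
    "openai": {"api_key"},
    "bedrock": {"region", "access_key_id", "secret_access_key"},
    "azure": {"api_key", "endpoint"},
    "github": {"api_key"},
    "gemini": {"api_key"},
    "local": set(),  # everything is optional for a local endpoint
}

def check_credentials(provider: str, creds: dict) -> list[str]:
    """Return the sorted list of missing required keys (empty = valid)."""
    required = REQUIRED_KEYS.get(provider)
    if required is None:
        return [f"unknown provider: {provider}"]
    return sorted(required - creds.keys())
```

Running such a check before constructing `OctoConfig` surfaces missing keys earlier than engine construction would.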
### Optional Fields

| Field | Default | Description |
|---|---|---|
| `default_model` | `claude-sonnet-4-5-20250929` | Model for general routing and chat |
| `high_tier_model` | (same as default) | Model for complex reasoning |
| `low_tier_model` | (same as default) | Model for summarization and cheap tasks |
| `model_profile` | `"balanced"` | `"quality"`, `"balanced"`, or `"budget"` |
| `storage` | `None` | `StorageBackend` instance for files, memory, skills |
| `checkpoint_backend` | `"sqlite"` | `"sqlite"` or `"postgres"` |
| `checkpoint_config` | `{}` | Backend-specific config (see below) |
| `context_limit` | `200000` | Max context window tokens |
| `tool_result_limit` | `40000` | Max chars for a single tool result |
| `summarization_trigger_tokens` | `40000` | Token count that triggers auto-summarization |
| `summarization_keep_tokens` | `8000` | Tokens of recent history to keep after summarization |
| `supervisor_msg_char_limit` | `30000` | Per-message char limit for the supervisor |
| `preloaded_tools` | `[]` | Pre-created LangChain tools (skips MCP loading) |
| `agent_configs` | `[]` | List of `AgentConfig` for custom agents |
| `system_prompt` | `""` | Custom system prompt override |
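Putting several optional fields together, a fuller configuration might look like the fragment below. The values shown are illustrative choices, not recommendations:

```python
from octo.core import OctoConfig
from octo.core.storage import FilesystemStorage

config = OctoConfig(
    llm_provider="anthropic",
    llm_credentials={"api_key": "sk-ant-..."},
    default_model="claude-sonnet-4-5-20250929",
    model_profile="budget",              # prefer the cheap tier where possible
    summarization_trigger_tokens=20000,  # summarize earlier than the 40000 default
    summarization_keep_tokens=4000,      # keep less recent history after summarizing
    storage=FilesystemStorage(root="/path/to/.octo"),
)
```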
### Validation

The config is validated automatically by `OctoEngine` on construction. Call `config.validate()` manually for early checks:

```python
from octo.core.config import OctoConfigError

try:
    config.validate()
except OctoConfigError as e:
    print(e)  # Lists all validation errors
## OctoEngine

The main entry point. Build once per configuration, invoke per message.

### Methods

| Method | Description |
|---|---|
| `await engine.invoke(message, thread_id=...)` | Process one message. Returns an `OctoResponse`. |
| `async for event in engine.stream(message, thread_id=...)` | Stream response events (tokens, tool calls). |
| `await engine.close()` | Clean up resources (DB connections, etc.). |
| `engine.is_built` | Property: whether the graph has been lazily built yet. |
## OctoResponse

```python
@dataclass
class OctoResponse:
    content: str                        # The assistant's reply
    thread_id: str                      # Conversation identifier
    context_tokens_used: int = 0        # Tokens consumed
    context_tokens_limit: int = 200000
    agent_name: str = ""                # Which agent produced the response
    error: str | None = None            # Non-None if invocation failed
    error_traceback: str | None = None
```
Errors are captured in the response (not raised), so callers can always inspect the result.
## StorageBackend
The StorageBackend protocol defines async file operations for memory, skills, and workspace files.
### Built-in Backends

#### FilesystemStorage

Local filesystem. No extra dependencies.

```python
from octo.core.storage import FilesystemStorage

storage = FilesystemStorage(root="/path/to/.octo")
```

#### S3Storage

S3-compatible object storage. Requires the `s3` extra:

```shell
pip install "octo-agent[s3]"
```

```python
from octo.core.storage import S3Storage

storage = S3Storage(
    bucket="my-bucket",
    prefix="octo/",
    region="us-east-1",
)
```
### Protocol

Any object implementing these async methods can be used as a storage backend:

```python
from typing import Protocol

class StorageBackend(Protocol):
    async def read(self, path: str) -> str: ...
    async def write(self, path: str, content: str) -> None: ...
    async def exists(self, path: str) -> bool: ...
    async def glob(self, pattern: str) -> list[str]: ...
```
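A dict-backed object satisfying those four methods is enough, which makes fakes easy to write for tests. The in-memory backend below is a sketch, not shipped with octo; its `glob` uses `fnmatch`-style matching, which may differ from the built-in backends' path-aware globbing:

```python
import fnmatch

class InMemoryStorage:
    """Minimal StorageBackend implementation backed by a dict.
    Handy for unit tests; not part of octo.core."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}

    async def read(self, path: str) -> str:
        return self._files[path]

    async def write(self, path: str, content: str) -> None:
        self._files[path] = content

    async def exists(self, path: str) -> bool:
        return path in self._files

    async def glob(self, pattern: str) -> list[str]:
        # fnmatch treats '*' as matching any characters, including '/'.
        return sorted(p for p in self._files if fnmatch.fnmatch(p, pattern))
```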
## Checkpointing

Conversation state is persisted via LangGraph checkpointers.

### SQLite (default)

No extra dependencies. The path defaults to `<workspace>/.octo/octo.db`.

```python
config = OctoConfig(
    checkpoint_backend="sqlite",
    checkpoint_config={"path": "/path/to/octo.db"},
    ...
)
```

### PostgreSQL

Requires the `postgres` extra:

```shell
pip install "octo-agent[postgres]"
```

```python
config = OctoConfig(
    checkpoint_backend="postgres",
    checkpoint_config={"dsn": "postgresql://user:pass@host/db"},
    ...
)
```
## Thread Safety

`OctoEngine` is not thread-safe. Each instance mutates global module state during graph build, so do not share engine instances across threads, and do not create multiple engines with different configs in the same process simultaneously. For multi-tenant scenarios, use one engine per process/worker.
## Installation

The core engine is included in the base package:

```shell
pip install octo-agent          # Core engine only
pip install "octo-agent[cli]"   # Core + CLI (Rich terminal)
pip install "octo-agent[all]"   # Everything
```