Why Self-Host an AI Agent?
By mid-2026, AI agents such as Claude Code, OpenAI Codex CLI, and the open-source Hermes Agent have become essential developer tools. Most people use cloud-hosted versions, but self-hosting gives you:
- Data sovereignty: your code and prompts never leave your machine
- Unlimited usage: no API quotas when paired with local LLMs
- Full customization: modify the code, add plugins, craft your own prompts
- Long-term cost savings: no monthly SaaS subscriptions for heavy users
This guide walks through deploying Hermes Agent with Docker Compose. Hermes Agent is an open-source, provider-agnostic AI agent framework that can use local models (via Ollama) and cloud APIs (OpenAI, Anthropic, DeepSeek) in a mixed setup.
Prerequisites
- Server / VPS: Minimum 4GB RAM, 20GB SSD (2GB RAM is enough for cloud-only mode)
- Docker + Docker Compose installed
- Basic Linux/command line familiarity
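As a rough preflight, the prerequisites above can be checked from a shell before going further. This is a sketch under assumptions: it reads /proc/meminfo (Linux only), and the 3500 MB default is an illustrative threshold, chosen because a nominal 4GB VPS usually reports slightly less than 4096 MB. The function names are hypothetical.

```shell
#!/bin/sh
# Preflight sketch: verify Docker, the Compose plugin, and available RAM.

# Succeeds if total_kb (kB, as reported by /proc/meminfo) is at least min_mb MB.
has_min_ram() {
    total_kb=$1
    min_mb=$2
    [ "$total_kb" -ge $((min_mb * 1024)) ]
}

# Prints MemTotal in kB (Linux only).
total_ram_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Usage: preflight [min_mb]   (min_mb is an assumed threshold, default 3500)
preflight() {
    min_mb=${1:-3500}
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found" >&2
        return 1
    fi
    if ! docker compose version >/dev/null 2>&1; then
        echo "docker compose plugin not found" >&2
        return 1
    fi
    if ! has_min_ram "$(total_ram_kb)" "$min_mb"; then
        echo "less than ${min_mb} MB of RAM available" >&2
        return 1
    fi
    echo "preflight ok"
}
```

Calling `preflight` with no arguments checks against the hybrid setup; something like `preflight 1800` would be a looser check for cloud-only mode on a 2GB host.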
Choosing Your Setup
| Mode | Pros | Cons | Best For |
|---|---|---|---|
| Cloud-only (GPT-5/Claude) | Fastest setup, strongest models | Monthly cost, data leaves your network | Quick prototyping |
| Hybrid (local agent + cloud LLM) | Data security + strong models | Still incurs API fees | Production use |
| Fully local (Ollama + Hermes) | Free, maximum privacy | Weaker local models | R&D, high-security |
This guide uses the hybrid approach: Hermes Agent runs in Docker, and LLM calls go to OpenAI or Anthropic, with Ollama as a local fallback for fast, cheap tasks.
Step 1: Docker Compose Setup
Create docker-compose.yml:
version: '3.8'
services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-agent
    restart: unless-stopped
    volumes:
      - ./hermes_data:/root/.hermes
      - ./workspace:/workspace
    environment:
      - OPENAI_API_KEY=${O...}
      - ANTHROPIC_API_KEY=${A...}
      - DEEPSEEK_API_KEY=${D...}
      - OLLAMA_HOST=http://ollama:11434
    ports:
      - "8080:8080"
    depends_on:
      ollama:
        condition: service_healthy
    command: ["hermes", "gateway", "run"]
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ./ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      retries: 3
Step 2: Environment Setup
Create a .env file:
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
TZ=Asia/Hong_Kong
At least one LLM provider key is required.
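Since at least one provider key is required, a quick check of the .env file before launching can save a failed start. A minimal sketch, assuming the three variable names used in this guide; the helper name is hypothetical, and placeholder values such as sk-... still count as non-empty, so this only catches missing or blank keys.

```shell
#!/bin/sh
# Succeeds if the given env file sets at least one provider key
# (OPENAI, ANTHROPIC, or DEEPSEEK) to a non-empty value.
has_provider_key() {
    env_file=$1
    grep -Eq '^(OPENAI|ANTHROPIC|DEEPSEEK)_API_KEY=.+' "$env_file"
}

# Example (commented out so the snippet has no side effects):
# has_provider_key .env || echo "Add at least one provider key to .env" >&2
```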
Step 3: Launch
mkdir -p hermes_data workspace ollama_data
docker compose up -d
docker compose logs hermes -f
On first boot you should see:
[2026-06-01 10:00:00] Hermes Agent started successfully
[2026-06-01 10:00:01] Gateway listening on port 8080
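If the launch is scripted, polling until the gateway responds is more robust than watching logs by hand. The sketch below assumes the gateway answers HTTP on port 8080 as mapped in the compose file; `retry` and `gateway_up` are hypothetical helper names, and any HTTP response (even an error status) is treated as "up".

```shell
#!/bin/sh
# Runs a command up to <attempts> times, sleeping <delay> seconds between tries.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Succeeds once anything answers HTTP on the published gateway port.
# curl exits non-zero on connection failure or timeout, zero on any HTTP reply.
gateway_up() {
    curl -s -o /dev/null --max-time 2 "http://localhost:8080/"
}

# Example (commented out so the snippet has no side effects):
# retry 30 2 gateway_up && echo "gateway is reachable"
```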
Optional: Pull a Local Model
docker exec ollama ollama pull mistral
docker exec ollama ollama pull llama3.1
docker exec ollama ollama run mistral "Hello, how are you?"
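To confirm a pull succeeded from a script, the output of `ollama list` can be checked for the model name. A sketch, assuming the usual table output with a header row and a NAME column in the form name:tag; `model_in_list` is a hypothetical helper.

```shell
#!/bin/sh
# Reads `ollama list` output on stdin and succeeds if the given model appears.
# Matches either the bare name ("mistral") or the full name:tag ("mistral:latest").
model_in_list() {
    awk -v m="$1" '
        NR > 1 {                      # skip the header row
            split($1, parts, ":")     # NAME column is name:tag
            if ($1 == m || parts[1] == m) found = 1
        }
        END { exit !found }
    '
}

# Example (commented out so the snippet has no side effects):
# docker exec ollama ollama list | model_in_list mistral || echo "mistral missing" >&2
```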
Step 4: Configure Hermes
Configure directly inside the container:
docker exec -it hermes-agent hermes config set model.default claude-sonnet-4
docker exec -it hermes-agent hermes config set model.provider anthropic
Key config settings to customize:
model:
  default: claude-sonnet-4
  provider: anthropic
agent:
  max_turns: 90
  tool_use_enforcement: true
terminal:
  backend: docker
  workdir: /workspace
memory:
  memory_enabled: true
  user_profile_enabled: true
delegation:
  model: gpt-5-mini
  max_iterations: 30
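The `hermes config set` commands shown earlier in this step can be batched from a small list of key/value pairs. A sketch only: `apply_config` is a hypothetical helper, and it is an assumption that settings beyond model.default and model.provider can be set through the same dotted-key form.

```shell
#!/bin/sh
# Reads "key value" pairs from stdin and applies each with `hermes config set`
# inside the container. With DRY_RUN=1 the commands are printed, not executed.
# Usage: apply_config [container_name] < pairs.txt
apply_config() {
    container=${1:-hermes-agent}
    while read -r key value; do
        # Skip blank lines and comment lines.
        case $key in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "docker exec $container hermes config set $key $value"
        else
            # </dev/null keeps docker from consuming the loop's stdin.
            docker exec "$container" hermes config set "$key" "$value" </dev/null
        fi
    done
}

# Example (commented out so the snippet has no side effects):
# printf 'model.default claude-sonnet-4\nmodel.provider anthropic\n' | DRY_RUN=1 apply_config
```

Running with DRY_RUN=1 first makes it easy to review the exact commands before they touch the container.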
Step 5: Connect a Chat Platform (Optional)
Connect Hermes to Telegram, Discord, or Slack:
docker exec -it hermes-agent hermes gateway setup
Choose your platform, enter the bot token or webhook URL, and restart:
docker compose restart hermes
Using Your Agent
Via API
curl -X POST http://localhost:8080/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "Create a Flask API with 3 endpoints", "session": "dev-project"}'
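When the message comes from a variable or user input, quotes inside it will break a hand-written JSON body. The sketch below builds the payload for the /api/chat call above with basic escaping; the helper names are hypothetical, and only backslashes and double quotes are escaped (newlines and other control characters are not handled).

```shell
#!/bin/sh
# Escapes backslashes and double quotes so the text can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Builds the request body with the "message" and "session" fields used above.
chat_payload() {
    printf '{"message": "%s", "session": "%s"}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Sends a message to the gateway on the port published in docker-compose.yml.
# Usage: send_chat <message> <session>
send_chat() {
    curl -s -X POST "http://localhost:8080/api/chat" \
        -H "Content-Type: application/json" \
        -d "$(chat_payload "$1" "$2")"
}

# Example (commented out so the snippet has no side effects):
# send_chat 'Create a Flask API with 3 endpoints' dev-project
```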
Via Telegram
Just send a message to your bot:
/create a FastAPI + SQLite todo list with CRUD endpoints and tests
Via Cron (Scheduled Tasks)
docker exec hermes-agent hermes cron create "every day 9am" \
--prompt "Check today's GitHub notifications and summarize" \
--delivery telegram
Advanced: Multi-Agent Setup
Run specialized agent profiles as separate containers:
agent-frontend:
  image: nousresearch/hermes-agent:latest
  container_name: hermes-frontend
  environment:
    - HERMES_PROFILE=frontend-dev
  volumes_from:
    - hermes
  command: ["hermes", "-p", "frontend-dev", "gateway", "run"]
Each agent gets its own model, skills, and toolset, which makes the setup essentially a mini AI team.
Troubleshooting
| Problem | Likely Cause | Fix |
|---|---|---|
| Container restart loop | Config error | Check logs: docker compose logs hermes |
| Ollama not using the GPU | Docker GPU access not configured | Add deploy.resources.reservations.devices to the ollama service |
| Gateway can’t connect | Invalid bot token | Regenerate token, check .env |
| “Tool not available” | Toolset not enabled | hermes tools enable terminal file |
| High memory usage | Ollama model too large | Use a 7B model instead of 13B+ |
Summary
Self-hosting an AI agent with Docker is straightforward and gives you full control over data, costs, and customization. This setup works well for individual developers, freelancers, and small teams.
Next steps:
- Explore Hermes Agent’s skill system: write custom skills for your workflow
- Connect MCP servers (Notion, GitHub, Jira) for external tool access
- Set up cron jobs for automated daily workflows