Self-Host AI Agents with Docker: A Step-by-Step Guide Using Hermes Agent · AgentFlow HK

Why Self-Host an AI Agent?

By mid-2026, AI agents like Claude Code, OpenAI Codex CLI, and the open-source Hermes Agent have become essential developer tools. While most people use cloud-hosted versions, self-hosting gives you:

Data sovereignty — your code and prompts never leave your machine
Unlimited usage — no API quotas when paired with local LLMs
Full customization — modify the code, add plugins, craft your own prompts
Long-term cost savings — no monthly SaaS subscriptions for heavy users

This guide walks through deploying Hermes Agent, an open-source, provider-agnostic AI agent framework, with Docker Compose. Hermes can use local models (via Ollama) and cloud APIs (OpenAI, Anthropic, DeepSeek) in a mixed setup.

Prerequisites

Server / VPS: Minimum 4GB RAM, 20GB SSD (2GB RAM is enough for cloud-only mode)
Docker + Docker Compose installed
Basic Linux/command line familiarity

Choosing Your Setup

Mode	Pros	Cons	Best For
Cloud-only (GPT-5/Claude)	Fastest setup, strongest models	Monthly cost, data leaves your network	Quick prototyping
Hybrid (local agent + cloud LLM)	Data security + strong models	API fees still apply	Production use
Fully local (Ollama + Hermes)	Free, maximum privacy	Weaker local models	R&D, high-security

This guide uses the hybrid approach: Hermes Agent runs in Docker, and LLM calls go to OpenAI or Anthropic, with Ollama as a local fallback for fast, cheap tasks.

Step 1: Docker Compose Setup

Create docker-compose.yml:

services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-agent
    restart: unless-stopped
    volumes:
      - ./hermes_data:/root/.hermes
      - ./workspace:/workspace
    environment:
      - OPENAI_API_KEY=${O...}
      - ANTHROPIC_API_KEY=${A...}
      - DEEPSEEK_API_KEY=${D...}
      - OLLAMA_HOST=http://ollama:11434
      - TZ=${TZ:-UTC}
    ports:
      - "8080:8080"
    depends_on:
      ollama:
        condition: service_healthy
    command: ["hermes", "gateway", "run"]

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ./ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      retries: 3
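
Before moving on, it can help to check that the file parses. The sketch below wraps `docker compose config --quiet`, which validates a Compose file and its variable interpolation without starting any containers. The function name `validate_compose` is a hypothetical helper, not part of Docker or Hermes. Until the `.env` file from the next step exists, Compose will typically print warnings about unset variables rather than fail.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a Compose file without starting anything.
# `docker compose config --quiet` parses the file, applies interpolation,
# and exits non-zero if the file is invalid.
validate_compose() {
  local file="${1:-docker-compose.yml}"
  if docker compose -f "$file" config --quiet; then
    echo "OK: $file is valid"
  else
    echo "ERROR: $file failed validation" >&2
    return 1
  fi
}

# Usage:
#   validate_compose docker-compose.yml
```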

Step 2: Environment Setup

Create a .env file:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
TZ=Asia/Hong_Kong

At least one LLM provider key is required.
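
The "at least one key" rule can be checked before launch. This is a minimal sketch, assuming the three variable names used in this guide. `has_provider_key` is a hypothetical helper name, and the check only confirms that a value is present, not that the key is valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the env file defines at least one
# non-empty provider key (OPENAI, ANTHROPIC, or DEEPSEEK).
has_provider_key() {
  local env_file="${1:-.env}"
  [ -f "$env_file" ] || return 1
  # Match KEY=value lines where the value has at least one character.
  grep -Eq '^(OPENAI|ANTHROPIC|DEEPSEEK)_API_KEY=.+' "$env_file"
}

# Usage:
#   has_provider_key .env || echo "No provider key set in .env" >&2
```

Since the file holds secrets, restricting its permissions with `chmod 600 .env` and keeping it out of version control is a sensible default.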

Step 3: Launch

mkdir -p hermes_data workspace ollama_data
docker compose up -d
docker compose logs -f hermes

On first boot you should see:

[2026-06-01 10:00:00] Hermes Agent started successfully
[2026-06-01 10:00:01] Gateway listening on port 8080
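
Watching the logs is one readiness signal. For scripts, polling the port is often easier. The sketch below is a generic retry loop. `retry` is a hypothetical helper, and the usage line assumes only that the gateway answers HTTP on port 8080, as the API example later in this guide implies. Without `-f`, `curl` exits 0 on any HTTP response, so this tests reachability rather than application health.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" && return 0                      # success: stop retrying
    [ "$i" -ge "$attempts" ] && return 1  # out of attempts
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage (waits up to about 60 seconds for the gateway port to respond):
#   retry 30 2 curl -s -o /dev/null http://localhost:8080/
```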

Optional: Pull a Local Model

docker exec ollama ollama pull mistral
docker exec ollama ollama pull llama3.1
docker exec ollama ollama run mistral "Hello, how are you?"
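
To confirm from the host that a model was pulled, Ollama's REST API lists local models at `/api/tags`. The sketch below scans that JSON with `grep`. `model_is_pulled` is a hypothetical helper, and the pattern treats the model name as a regular expression, so a dot in a name such as `llama3.1` matches any character. A JSON parser like `jq` would be more robust if it is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the /api/tags JSON on stdin and succeed if the
# given model name appears, with or without a tag such as ":latest".
model_is_pulled() {
  local model="$1"
  grep -Eq "\"name\":[[:space:]]*\"${model}(:[^\"]*)?\""
}

# Usage:
#   curl -s http://localhost:11434/api/tags | model_is_pulled mistral \
#     && echo "mistral is available locally"
```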

Step 4: Configure Hermes

Configure directly inside the container:

docker exec -it hermes-agent hermes config set model.default claude-sonnet-4
docker exec -it hermes-agent hermes config set model.provider anthropic

Key config settings to customize:

model:
  default: claude-sonnet-4
  provider: anthropic

agent:
  max_turns: 90
  tool_use_enforcement: true

terminal:
  backend: docker
  workdir: /workspace

memory:
  memory_enabled: true
  user_profile_enabled: true

delegation:
  model: gpt-5-mini
  max_iterations: 30
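
If you prefer to script these settings rather than set them one at a time, a loop over key/value pairs is one option. This sketch assumes that `hermes config set` accepts the other dotted keys shown above in the same way it accepts `model.default` and `model.provider`; this guide only demonstrates those two, so verify the rest against your Hermes version. `apply_hermes_config` and the `DRY_RUN` variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "key value" pairs from stdin and apply each one
# with `hermes config set` inside the hermes-agent container.
# With DRY_RUN=1 the commands are printed instead of executed.
apply_hermes_config() {
  local key value
  while read -r key value; do
    [ -n "$key" ] || continue               # skip blank lines
    case "$key" in \#*) continue ;; esac    # skip comment lines
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker exec hermes-agent hermes config set $key $value"
    else
      # </dev/null keeps docker exec from consuming the loop's stdin.
      docker exec hermes-agent hermes config set "$key" "$value" </dev/null
    fi
  done
}

# Usage:
#   printf 'model.default claude-sonnet-4\nmodel.provider anthropic\n' \
#     | DRY_RUN=1 apply_hermes_config
```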

Step 5: Connect a Chat Platform (Optional)

Connect Hermes to Telegram, Discord, or Slack:

docker exec -it hermes-agent hermes gateway setup

Choose your platform, enter the bot token or webhook URL, and restart:

docker compose restart hermes

Using Your Agent

Via API

curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a Flask API with 3 endpoints", "session": "dev-project"}'
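
Hand-written JSON breaks as soon as a prompt contains a double quote. The sketch below wraps the same request in a small function that escapes the message first. It assumes the `/api/chat` endpoint and the `message` and `session` fields shown above. `json_escape`, `chat_payload`, `ask_hermes`, and the `HERMES_URL` variable are hypothetical names. The escaping covers backslashes and double quotes only, not newlines or control characters.

```shell
#!/usr/bin/env bash
# Escape backslashes and double quotes for use inside a JSON string literal.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body: chat_payload <message> <session>
chat_payload() {
  printf '{"message": "%s", "session": "%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send a prompt to the gateway: ask_hermes <message> [session]
ask_hermes() {
  curl -sS -X POST "${HERMES_URL:-http://localhost:8080}/api/chat" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$1" "${2:-dev-project}")"
}

# Usage:
#   ask_hermes 'Create a Flask API with a "health" endpoint' dev-project
```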

Via Telegram

Just send a message to your bot:

/create a FastAPI + SQLite todo list with CRUD endpoints and tests

Via Cron (Scheduled Tasks)

docker exec hermes-agent hermes cron create "every day 9am" \
  --prompt "Check today's GitHub notifications and summarize" \
  --delivery telegram

Advanced: Multi-Agent Setup

Run specialized agent profiles as separate containers by adding a service like this under `services:` in docker-compose.yml:

  agent-frontend:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-frontend
    environment:
      - HERMES_PROFILE=frontend-dev
    volumes_from:
      - hermes
    command: ["hermes", "-p", "frontend-dev", "gateway", "run"]

Each agent gets its own model, skills, and toolset — essentially a mini AI team.
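
With several profiles, the per-agent stanzas can be generated rather than copied by hand. This is a rough sketch that prints a Compose override file with one service per profile, following the stanza above. `render_agents_override`, the `agent-<profile>` service names, and the `docker-compose.agents.yml` file name are hypothetical choices, and each profile is assumed to already exist in Hermes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Compose override with one agent service
# per profile name given as an argument.
render_agents_override() {
  local profile
  echo "services:"
  for profile in "$@"; do
    cat <<EOF
  agent-${profile}:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-${profile}
    environment:
      - HERMES_PROFILE=${profile}
    volumes_from:
      - hermes
    command: ["hermes", "-p", "${profile}", "gateway", "run"]
EOF
  done
}

# Usage:
#   render_agents_override frontend-dev backend-dev > docker-compose.agents.yml
#   docker compose -f docker-compose.yml -f docker-compose.agents.yml up -d
```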

Troubleshooting

Problem	Likely Cause	Fix
Container restart loop	Config error	Check logs: `docker compose logs hermes`
Ollama not using the GPU	Docker GPU access not configured	Add `deploy.resources.reservations.devices` to the `ollama` service
Gateway can’t connect	Invalid bot token	Regenerate token, check `.env`
“Tool not available”	Toolset not enabled	`hermes tools enable terminal file`
High memory usage	Local model too large for available RAM	Use a 7B model instead of 13B+
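
For the GPU row, one way to add the device reservation without editing the main file is a Compose override. The sketch below writes one for the `ollama` service, assuming an NVIDIA GPU with the NVIDIA Container Toolkit installed on the host. `write_gpu_override` and the `docker-compose.gpu.yml` file name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a Compose override that reserves all NVIDIA
# GPUs for the ollama service.
write_gpu_override() {
  local out="${1:-docker-compose.gpu.yml}"
  cat > "$out" <<'EOF'
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
}

# Usage:
#   write_gpu_override
#   docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```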

Summary

Self-hosting an AI agent with Docker is straightforward and gives you full control over data, costs, and customization. This setup works well for individual developers, freelancers, and small teams.

Next steps:

Explore Hermes Agent’s skill system — write custom skills for your workflow
Connect MCP servers (Notion, GitHub, Jira) for external tool access
Set up cron jobs for automated daily workflows