Self-Host AI Agents with Docker: A Step-by-Step Guide Using Hermes Agent

Why Self-Host an AI Agent?

By mid-2026, AI agents like Claude Code, OpenAI Codex CLI, and the open-source Hermes Agent have become essential developer tools. While most people use cloud-hosted versions, self-hosting gives you:

  • Data sovereignty: your code and prompts never leave your machine
  • Unlimited usage: no API quotas when paired with local LLMs
  • Full customization: modify the code, add plugins, craft your own prompts
  • Long-term cost savings: no monthly SaaS subscriptions for heavy users

This guide walks through deploying Hermes Agent, an open-source, provider-agnostic AI agent framework, with Docker Compose. Hermes can use local models (via Ollama) and cloud APIs (OpenAI, Anthropic, DeepSeek) in a mixed setup.

Prerequisites

  • Server / VPS: Minimum 4GB RAM, 20GB SSD (2GB RAM is enough for cloud-only mode)
  • Docker + Docker Compose installed
  • Basic Linux/command line familiarity
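
Before starting, a quick preflight check can confirm that the tools listed above are present. The sketch below is an illustration only: the function names `require_cmd`, `mem_total_mb`, and `preflight` are made up for this example, and the RAM reading assumes a Linux host that exposes /proc/meminfo.

```shell
# Succeeds if the named command is on PATH; prints a hint to stderr otherwise.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }
}

# Total RAM in MB, read from /proc/meminfo (Linux only).
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Checks Docker and the Compose plugin, then reports installed RAM.
preflight() {
  require_cmd docker || return 1
  docker compose version >/dev/null 2>&1 || {
    echo "missing: docker compose plugin" >&2
    return 1
  }
  echo "RAM: $(mem_total_mb) MB (this guide suggests 4GB, or 2GB for cloud-only mode)"
}
```

Run `preflight` once on the server. Any "missing" line points at what to install first.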

Choosing Your Setup

Mode | Pros | Cons | Best For
Cloud-only (GPT-5/Claude) | Fastest setup, strongest models | Monthly cost, data leaves your network | Quick prototyping
Hybrid (local agent + cloud LLM) | Data security + strong models | Still pays API fees | Production use
Fully local (Ollama + Hermes) | Free, maximum privacy | Weaker local models | R&D, high-security

This guide uses the hybrid approach: Hermes Agent runs in Docker, and LLM calls go to OpenAI or Anthropic (with Ollama as a local fallback for fast, cheap tasks).

Step 1: Docker Compose Setup

Create docker-compose.yml:

# No top-level "version" key: Compose v2 ignores it and warns that it is obsolete.

services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-agent
    restart: unless-stopped
    volumes:
      - ./hermes_data:/root/.hermes
      - ./workspace:/workspace
    environment:
      - OPENAI_API_KEY=${O...}
      - ANTHROPIC_API_KEY=${A...}
      - DEEPSEEK_API_KEY=${D...}
      - OLLAMA_HOST=http://ollama:11434
    ports:
      - "8080:8080"
    depends_on:
      ollama:
        condition: service_healthy
    command: ["hermes", "gateway", "run"]

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ./ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      retries: 3

Step 2: Environment Setup

Create a .env file:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
TZ=Asia/Hong_Kong

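At least one LLM provider key is required. Note that Compose reads .env only for variable substitution in docker-compose.yml, so TZ reaches a container only if you also list it under that service's environment.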
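
The "at least one provider key" requirement can be checked before launch. This is a minimal sketch: `has_provider_key` is a hypothetical helper, and it only tests that one of the three key names from the .env file above has a non-empty value. It does not check that the key is valid, so a placeholder value still passes.

```shell
# Returns 0 if FILE sets at least one provider key to a non-empty value.
has_provider_key() {
  grep -Eq '^(OPENAI|ANTHROPIC|DEEPSEEK)_API_KEY=.+' "$1"
}

# Example:
#   has_provider_key .env || echo "add at least one provider key to .env" >&2
```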

Step 3: Launch

mkdir -p hermes_data workspace ollama_data
docker compose up -d
docker compose logs hermes -f

On first boot you should see:

[2026-06-01 10:00:00] Hermes Agent started successfully
[2026-06-01 10:00:01] Gateway listening on port 8080
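
Rather than watching the logs by hand, a script can poll until the services respond. The helper below is a sketch: `wait_until` is a hypothetical name, and the usage lines assume the gateway answers plain HTTP on port 8080 and that the Ollama container is named `ollama`, as in the compose file above.

```shell
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is exhausted.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Examples (not executed here):
#   # Wait up to 60s for anything to answer HTTP on the gateway port.
#   wait_until 30 2 curl -s -o /dev/null http://localhost:8080/
#   # Wait for the Ollama healthcheck to report "healthy".
#   wait_until 30 2 sh -c \
#     '[ "$(docker inspect --format "{{.State.Health.Status}}" ollama)" = healthy ]'
```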

Optional: Pull a Local Model

docker exec ollama ollama pull mistral
docker exec ollama ollama pull llama3.1
docker exec ollama ollama run mistral "Hello, how are you?"

Step 4: Configure Hermes

Configure directly inside the container:

docker exec -it hermes-agent hermes config set model.default claude-sonnet-4
docker exec -it hermes-agent hermes config set model.provider anthropic

Key config settings to customize:

model:
  default: claude-sonnet-4
  provider: anthropic

agent:
  max_turns: 90
  tool_use_enforcement: true

terminal:
  backend: docker
  workdir: /workspace

memory:
  memory_enabled: true
  user_profile_enabled: true

delegation:
  model: gpt-5-mini
  max_iterations: 30
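
Several settings can be applied in one pass by generating the `hermes config set` commands from a list. The sketch below only prints the commands so they can be reviewed first. `config_cmds` is a hypothetical helper, and it assumes that dotted keys such as `agent.max_turns` work the same way as the `model.default` key shown earlier, which this guide does not confirm.

```shell
# Reads "key=value" lines on stdin and prints one config command per line.
# The -it flags are dropped because the commands are meant to run without a TTY.
config_cmds() {
  while IFS='=' read -r key value; do
    [ -n "$key" ] || continue
    printf 'docker exec hermes-agent hermes config set %s %s\n' "$key" "$value"
  done
}

# Example (not executed here):
#   printf 'agent.max_turns=90\ndelegation.max_iterations=30\n' | config_cmds
```

Once the output looks right, pipe it to `sh` to apply the settings.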

Step 5: Connect a Chat Platform (Optional)

Connect Hermes to Telegram, Discord, or Slack:

docker exec -it hermes-agent hermes gateway setup

Choose your platform, enter the bot token or webhook URL, and restart:

docker compose restart hermes

Using Your Agent

Via API

curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a Flask API with 3 endpoints", "session": "dev-project"}'
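
Hand-written JSON breaks as soon as a prompt contains a double quote or a backslash. The sketch below builds the request body for the same /api/chat endpoint with those two characters escaped. `json_escape` and `chat_payload` are hypothetical helpers, and they handle single-line text only. For multi-line prompts or other control characters, a real JSON tool such as jq is the safer choice.

```shell
# Escapes backslashes and double quotes so the text is safe inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: chat_payload MESSAGE SESSION
chat_payload() {
  printf '{"message": "%s", "session": "%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Example (not executed here):
#   curl -X POST http://localhost:8080/api/chat \
#     -H "Content-Type: application/json" \
#     -d "$(chat_payload 'Add a "health" endpoint' dev-project)"
```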

Via Telegram

Just send a message to your bot:

/create a FastAPI + SQLite todo list with CRUD endpoints and tests

Via Cron (Scheduled Tasks)

docker exec hermes-agent hermes cron create "every day 9am" \
  --prompt "Check today's GitHub notifications and summarize" \
  --delivery telegram

Advanced: Multi-Agent Setup

Run specialized agent profiles as separate containers:

  agent-frontend:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-frontend
    environment:
      - HERMES_PROFILE=frontend-dev
    volumes_from:
      - hermes
    command: ["hermes", "-p", "frontend-dev", "gateway", "run"]

Each agent gets its own model, skills, and toolset, essentially a mini AI team.
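
When adding several profiles, the service block can be templated instead of copied by hand. This is a sketch only: `profile_service` is a hypothetical helper that reprints the block above with a different short name and profile name. It assumes each profile (for example a `backend-dev` profile) already exists in Hermes.

```shell
# Usage: profile_service SHORT_NAME PROFILE_NAME
# Prints a service entry, indented for pasting under "services:".
profile_service() {
  cat <<EOF
  agent-$1:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-$1
    environment:
      - HERMES_PROFILE=$2
    volumes_from:
      - hermes
    command: ["hermes", "-p", "$2", "gateway", "run"]
EOF
}

# Example (not executed here):
#   profile_service backend backend-dev
```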

Troubleshooting

Problem | Likely Cause | Fix
Container restart loop | Config error | Check logs: docker compose logs hermes
Ollama no GPU | Docker GPU not configured | Add deploy.resources.reservations.devices
Gateway can’t connect | Invalid bot token | Regenerate token, check .env
“Tool not available” | Toolset not enabled | hermes tools enable terminal file
High memory usage | Ollama too large | Use 7B model instead of 13B+
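
For the GPU row, one way to add the device reservation without editing the main file is a Compose override file. The sketch below writes one for the `ollama` service. `write_gpu_override` and the file name docker-compose.gpu.yml are made up for this example, and the settings assume an NVIDIA GPU with the NVIDIA Container Toolkit installed on the host.

```shell
# Usage: write_gpu_override PATH
# Writes an override file that reserves all NVIDIA GPUs for the ollama service.
write_gpu_override() {
  cat > "$1" <<'EOF'
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
}

# Example (not executed here):
#   write_gpu_override docker-compose.gpu.yml
#   docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```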

Summary

Self-hosting an AI agent with Docker is straightforward and gives you full control over data, costs, and customization. This setup works well for individual developers, freelancers, and small teams.

Next steps:

  1. Explore Hermes Agent’s skill system: write custom skills for your workflow
  2. Connect MCP servers (Notion, GitHub, Jira) for external tool access
  3. Set up cron jobs for automated daily workflows