Last Updated: March 31, 2026

Introduction

Hermes Agent is an open-source AI agent by Nous Research with persistent memory, 40+ built-in tools, MCP integration, and a self-improvement loop that learns from every task it completes. It supports custom OpenAI-compatible endpoints, making it straightforward to point at a self-hosted model on SaladCloud. Hermes Agent pairs especially well with SaladCloud because:
  • OpenAI-compatible endpoint support - connects to any SaladCloud deployment via the built-in setup wizard
  • Rich MCP integration - connect GitHub, Webflow, Playwright, and any other MCP server
  • Per-hour pricing - no per-token costs during long agentic sessions with memory and multi-step tasks
  • Self-improving - the agent writes structured records of what worked and what didn’t, improving future task performance automatically
Hermes Agent does not yet support custom HTTP headers for the main LLM provider, and SaladCloud authenticates requests with a Salad-Api-Key header that Hermes Agent cannot send. Deploy your SaladCloud container group with authentication disabled when using Hermes Agent, or put a custom proxy in front of the endpoint to handle authentication.
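If you would rather keep authentication enabled, a reverse proxy in front of the endpoint can inject the header on Hermes Agent's behalf. A minimal nginx sketch (the hostname, port, and key below are placeholders you would substitute):

```nginx
server {
    listen 8080;

    location / {
        # Forward every request to the SaladCloud endpoint, adding the
        # Salad-Api-Key header that Hermes Agent cannot send itself.
        proxy_pass https://your-endpoint.salad.cloud;
        proxy_set_header Salad-Api-Key "your-salad-api-key";
        proxy_set_header Host your-endpoint.salad.cloud;
        proxy_ssl_server_name on;
    }
}
```

Point Hermes Agent's base URL at http://localhost:8080/v1 instead of the endpoint itself.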

Prerequisites

Before getting started, make sure you have:
  • A SaladCloud account
  • Linux, macOS, or Windows (WSL2)
  • Python 3.10+ and Node.js 18+ (the installer handles these automatically)

Step-by-Step Setup

Step 1: Deploy an LLM Recipe on SaladCloud (Without Authentication)

Deploy an OpenAI-compatible LLM server on SaladCloud with the Authentication toggle disabled.
  • Go to the SaladCloud portal and create an account if you do not already have one.
  • Create an organization or choose an existing one, then click “Deploy a container group”.
  • Select an LLM recipe. For the best experience with Hermes Agent’s tool-calling capabilities, use vLLM with a Hermes-format model, or the Qwen3.5-35B-A3B (llama.cpp) recipe.
  • Leave Authentication disabled - Hermes Agent cannot pass a Salad-Api-Key header at the moment.
  • Once deployed, note your API endpoint URL from SaladCloud (e.g., https://your-endpoint.salad.cloud). The OpenAI-compatible API will be available at {endpoint}/v1.
Available recipes range from ready-to-deploy options (best for less technical users) to recipes for custom deployments (best for advanced users):
  • vllm - Recommended; enable Hermes tool calling with --enable-auto-tool-choice --tool-call-parser hermes
  • llama.cpp - Supports GGUF models
  • ollama - Simple model management
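Whichever recipe you choose, you can confirm the deployment is serving before wiring up Hermes Agent (the hostname below is a placeholder for your own endpoint URL):

```shell
# List the models the endpoint is serving. An OpenAI-compatible server
# responds with JSON of the form {"object": "list", "data": [{"id": "..."}]}.
curl -s https://your-endpoint.salad.cloud/v1/models | python3 -m json.tool
```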

Step 2: Install Hermes Agent

Run the official installer - it handles Python, Node.js, dependencies, and the hermes command automatically:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Reload your shell, then verify:
source ~/.bashrc   # or ~/.zshrc on macOS
hermes --version

Step 3: Run the Setup Wizard

Run the interactive setup wizard, which handles all model configuration:
hermes setup
When the wizard asks for a model provider, select Custom OpenAI-compatible endpoint and fill in:
  • Base URL: https://your-endpoint.salad.cloud/v1
  • API key: dummy (any non-empty string - the endpoint has no auth)
  • Context length: 262144
The wizard will auto-detect the available models from your endpoint, so you do not need to specify a model name manually. It also configures memory, tools, terminal backend, and optional messaging channels - you can accept defaults for everything and adjust later via hermes config edit, which opens ~/.hermes/config.yaml.
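After the wizard finishes, the model section of ~/.hermes/config.yaml should look roughly like this (a sketch based on the values entered above; the model id is a placeholder, and exact key names may vary by version):

```yaml
model:
  base_url: https://your-endpoint.salad.cloud/v1
  api_key: dummy          # any non-empty string; the endpoint has no auth
  default: your-model-id  # auto-detected from the endpoint by the wizard
  context_length: 262144
```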
Hermes Agent can import existing OpenClaw settings automatically if you have previously configured it.

Step 4: Verify Your Configuration

Before testing, confirm your setup by checking the configuration file:
hermes config show
Verify that:
  • model.base_url points to your SaladCloud endpoint with /v1 appended
  • model.api_key is set to any non-empty value (e.g., dummy)
  • model.default is detected correctly from your endpoint
If anything looks incorrect, edit the config:
hermes config edit

Step 5: Test the Connection

Launch Hermes Agent:
hermes chat
Test with a simple task:
“Summarize what SaladCloud is in two sentences.”
If Hermes Agent responds, your setup is complete.
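If the agent hangs or errors instead, test the endpoint directly to rule out a deployment problem (the hostname and model id are placeholders; use an id your endpoint reports at /v1/models):

```shell
# Send one chat completion straight to the endpoint, bypassing Hermes Agent.
curl -s https://your-endpoint.salad.cloud/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "your-model-id", "messages": [{"role": "user", "content": "Say hello."}]}' \
  | python3 -m json.tool
```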

Built-in Tools

Hermes Agent ships with 40+ built-in tools out of the box - no MCP servers or plugins required for most common tasks.

Web & Browser

  • web_search - Search the web and return results
  • web_extract - Extract and parse content from any URL
  • browser_navigate - Control a real browser to navigate pages
  • browser_snapshot - Capture the current state of a browser page
  • browser_vision - Interact with pages using vision (click, read, etc.)

Files & Terminal

  • terminal - Execute shell commands
  • process - Start and manage background processes
  • read_file - Read file contents
  • patch - Modify files with targeted edits
  • execute_code - Run code in an isolated environment

Media & Voice

  • text_to_speech - Convert text to spoken audio
  • vision_analyze - Analyze and describe images
  • image_generate - Create images via an image generation model

Memory & Tasks

  • memory - Read and write persistent memory across sessions
  • session_search - Search past conversation sessions
  • todo - Create and manage task lists
  • cronjob - Schedule recurring tasks
  • clarify - Ask the user a clarifying question before proceeding

Agent Orchestration

  • delegate_task - Spin up a subagent to handle a subtask in parallel
  • send_message - Send a message to a connected messaging channel
Most tools are enabled by default. You can enable or disable individual tools with:
hermes tools

MCP Server Integration

Hermes Agent has first-class MCP support. Add any MCP server to ~/.hermes/config.yaml under mcp_servers. More details can be found in the official MCP documentation.

Filesystem MCP

mcp_servers:
  filesystem:
    command: 'npx'
    args: ['-y', '@modelcontextprotocol/server-filesystem', '/path/to/your/project']

GitHub MCP

mcp_servers:
  github:
    command: 'npx'
    args: ['-y', '@modelcontextprotocol/server-github']
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: your-github-token

Playwright MCP (Browser Automation)

mcp_servers:
  playwright:
    command: 'npx'
    args: ['-y', '@playwright/mcp@latest']

Webflow MCP

mcp_servers:
  webflow:
    command: 'npx'
    args: ['-y', 'webflow-mcp-server@latest']
    env:
      WEBFLOW_TOKEN: your-webflow-token

Remote HTTP MCP Server

mcp_servers:
  remote_api:
    url: 'https://mcp.example.com/mcp'
    headers:
      Authorization: 'Bearer your-token'
After editing the config, reload MCP servers without restarting:
/reload-mcp
You can also filter which tools each server exposes:
mcp_servers:
  github:
    command: 'npx'
    args: ['-y', '@modelcontextprotocol/server-github']
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: your-token
    tools:
      include: [create_issue, list_issues, get_file_contents]

Configure Memory

Hermes Agent’s persistent memory is one of its strongest features. It works out of the box but can be tuned:
memory:
  memory_enabled: true
  memory_char_limit: 2200 # ~800 tokens
Increase memory_char_limit if the agent is working on large projects and needs to retain more context between sessions.

Use Docker for Tool Execution

For safer code execution and better isolation, set the terminal backend to Docker:
terminal:
  backend: docker
  docker_image: 'nikolaik/python-nodejs:python3.11-nodejs20'

Model Recommendations

  • Qwen 3.5-35B-A3B with llama.cpp: Good for conversational tasks and simpler agentic workflows
  • Qwen 3.5-9B: Suitable for lightweight tasks; less reliable for complex multi-step tool chains