Last Updated: March 31, 2026
Introduction
Hermes Agent is an open-source AI agent by
Nous Research with persistent memory, 40+ built-in tools, MCP integration, and a
self-improvement loop that learns from every task it completes. It supports custom OpenAI-compatible endpoints, making
it straightforward to point at a self-hosted model on SaladCloud.
Hermes Agent pairs especially well with SaladCloud because:
- OpenAI-compatible endpoint support - connects to any SaladCloud deployment via the built-in setup wizard
- Rich MCP integration - connect GitHub, Webflow, Playwright, and any other MCP server
- Per-hour pricing - no per-token costs during long agentic sessions with memory and multi-step tasks
- Self-improving - the agent writes structured records of what worked and what didn’t, improving future task
performance automatically
Hermes Agent does not yet support custom HTTP headers for the main LLM provider. SaladCloud authenticates requests
with a Salad-Api-Key header, which Hermes Agent cannot send. When using Hermes Agent, deploy your SaladCloud
container group with authentication disabled, or place a small proxy in front of the endpoint to add the header.
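If you prefer to keep authentication enabled, one workaround is a local reverse proxy that injects the header for you. A minimal sketch using nginx (the endpoint URL and API key are placeholders - substitute your own):

```nginx
# Local reverse proxy that adds the Salad-Api-Key header Hermes Agent cannot send.
# Point Hermes Agent's base URL at http://localhost:8080/v1 instead of the SaladCloud URL.
server {
    listen 8080;
    location / {
        proxy_pass https://your-endpoint.salad.cloud;
        proxy_set_header Host your-endpoint.salad.cloud;
        proxy_ssl_server_name on;
        proxy_set_header Salad-Api-Key "your-salad-api-key";
    }
}
```

With this running locally, the agent talks to an unauthenticated local port while the proxy handles SaladCloud authentication.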
Prerequisites
Before getting started, make sure you have:
- A SaladCloud account
- Linux, macOS, or Windows (WSL2)
- Python 3.10+ and Node.js 18+ (the installer handles these automatically)
Step-by-Step Setup
Step 1: Deploy an LLM Recipe on SaladCloud (Without Authentication)
Deploy an OpenAI-compatible LLM server on SaladCloud with the Authentication toggle disabled.
- Go to the SaladCloud portal and create an account if you do not already have one.
- Create an organization or choose an existing one, then click “Deploy a container group”.
- Select an LLM recipe. For the best experience with Hermes Agent’s tool-calling capabilities, use
vLLM with a Hermes-format model, or the
Qwen3.5-35B-A3B (llama.cpp) recipe.
- Leave Authentication disabled - Hermes Agent cannot pass a
Salad-Api-Key header at the moment.
- Once deployed, note your API endpoint URL from SaladCloud (e.g.,
https://your-endpoint.salad.cloud). The
OpenAI-compatible API will be available at {endpoint}/v1.
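To confirm the endpoint is reachable before configuring Hermes Agent, you can query the OpenAI-compatible models route (the URL below is a placeholder):

```shell
# Build the OpenAI-compatible base URL from your SaladCloud access domain
ENDPOINT="https://your-endpoint.salad.cloud"   # placeholder - use your real URL
BASE_URL="${ENDPOINT}/v1"
echo "$BASE_URL"

# With the container group running, list the models it serves:
# curl -s "${BASE_URL}/models"
```

The `/models` response is what the Hermes setup wizard later uses to auto-detect your model.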
Available recipes:
Ready-to-deploy recipes (best for less technical users) include the Qwen3.5-35B-A3B (llama.cpp) recipe mentioned above.
Recipes for custom deployments (best for advanced users):
- vllm - Recommended; enable Hermes tool calling with
--enable-auto-tool-choice --tool-call-parser hermes
- llama.cpp - Supports GGUF models
- ollama - Simple model management
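For a custom vLLM deployment, the container start command needs the two flags above so the server emits Hermes-format tool calls. A sketch (the model name is illustrative - substitute the Hermes-format model you deploy):

```shell
# Illustrative vLLM start command with Hermes tool calling enabled
vllm serve NousResearch/Hermes-3-Llama-3.1-8B \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```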
Step 2: Install Hermes Agent
Run the official installer - it handles Python, Node.js, dependencies, and the hermes command automatically:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Reload your shell, then verify:
source ~/.bashrc # or ~/.zshrc on macOS
hermes --version
Step 3: Run the Setup Wizard
Run the interactive setup wizard; it walks you through all model configuration.
When the wizard asks for a model provider, select Custom OpenAI-compatible endpoint and fill in:
- Base URL:
https://your-endpoint.salad.cloud/v1
- API key:
dummy (any non-empty string - the endpoint has no auth)
- Context length:
262144
The wizard will auto-detect the available models from your endpoint, so you do not need to specify a model name
manually. It also configures memory, tools, terminal backend, and optional messaging channels - you can accept defaults
for everything and adjust later via hermes config edit, which opens ~/.hermes/config.yaml.
If you have previously configured OpenClaw, Hermes Agent can import those settings automatically.
Step 4: Verify Your Configuration
Before testing, open ~/.hermes/config.yaml and verify that:
- model.base_url points to your SaladCloud endpoint with /v1 appended
- model.api_key is set to any non-empty value (e.g., dummy)
- model.default is detected correctly from your endpoint
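With those values in place, the model section of ~/.hermes/config.yaml looks roughly like this (a sketch - the detected model id will differ for your deployment):

```yaml
model:
  base_url: 'https://your-endpoint.salad.cloud/v1'
  api_key: 'dummy'           # any non-empty string; the endpoint has no auth
  default: 'your-model-id'   # auto-detected from the endpoint
```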
If anything looks incorrect, run hermes config edit to adjust it.
Step 5: Test the Connection
Launch Hermes Agent and test it with a simple task:
“Summarize what SaladCloud is in two sentences.”
If Hermes Agent responds, your setup is complete.
Hermes Agent ships with 40+ built-in tools out of the box - no MCP servers or plugins required for most common tasks.
Web & Search
| Tool | What it does |
|---|---|
| web_search | Search the web and return results |
| web_extract | Extract and parse content from any URL |
| browser_navigate | Control a real browser to navigate pages |
| browser_snapshot | Capture the current state of a browser page |
| browser_vision | Interact with pages using vision (click, read, etc.) |
Files & Terminal
| Tool | What it does |
|---|---|
| terminal | Execute shell commands |
| process | Start and manage background processes |
| read_file | Read file contents |
| patch | Modify files with targeted edits |
| execute_code | Run code in an isolated environment |
Media
| Tool | What it does |
|---|---|
| text_to_speech | Convert text to spoken audio |
| vision_analyze | Analyze and describe images |
| image_generate | Create images via an image generation model |
Memory & Tasks
| Tool | What it does |
|---|---|
| memory | Read and write persistent memory across sessions |
| session_search | Search past conversation sessions |
| todo | Create and manage task lists |
| cronjob | Schedule recurring tasks |
| clarify | Ask the user a clarifying question before proceeding |
Agent Orchestration
| Tool | What it does |
|---|---|
| delegate_task | Spin up a subagent to handle a subtask in parallel |
| send_message | Send a message to a connected messaging channel |
Most tools are enabled by default, and you can enable or disable individual tools in your configuration.
MCP Server Integration
Hermes Agent has first-class MCP support. Add any MCP server to ~/.hermes/config.yaml under mcp_servers. More
details can be found in the official MCP documentation.
Filesystem MCP
mcp_servers:
filesystem:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-filesystem', '/path/to/your/project']
GitHub MCP
mcp_servers:
github:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-github']
env:
GITHUB_PERSONAL_ACCESS_TOKEN: your-github-token
Playwright MCP (Browser Automation)
mcp_servers:
playwright:
command: 'npx'
args: ['-y', '@playwright/mcp@latest']
Webflow MCP
mcp_servers:
webflow:
command: 'npx'
args: ['-y', 'webflow-mcp-server@latest']
env:
WEBFLOW_TOKEN: your-webflow-token
Remote HTTP MCP Server
mcp_servers:
remote_api:
url: 'https://mcp.example.com/mcp'
headers:
Authorization: 'Bearer your-token'
After editing the config, you can reload MCP servers without restarting the agent.
You can also filter which tools each server exposes:
mcp_servers:
github:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-github']
env:
GITHUB_PERSONAL_ACCESS_TOKEN: your-token
tools:
include: [create_issue, list_issues, get_file_contents]
Hermes Agent’s persistent memory is one of its strongest features. It works out of the box but can be tuned:
memory:
memory_enabled: true
memory_char_limit: 2200 # ~800 tokens
Increase memory_char_limit if the agent is working on large projects and needs to retain more context between
sessions.
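For example, to roughly double the retained context (the value is illustrative):

```yaml
memory:
  memory_enabled: true
  memory_char_limit: 4400 # ~1600 tokens
```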
For safer code execution and better isolation, set the terminal backend to Docker:
terminal:
backend: docker
docker_image: 'nikolaik/python-nodejs:python3.11-nodejs20'
Model Recommendations
- Qwen 3.5-35B-A3B with llama.cpp: Good for conversational tasks and simpler agentic workflows
- Qwen 3.5-9B: Suitable for lightweight tasks; less reliable for complex multi-step tool chains