Documentation Index
Fetch the complete documentation index at: https://docs.salad.com/llms.txt
Use this file to discover all available pages before exploring further.
Last Updated: May 18, 2026
Introduction
Hermes Agent is an open-source AI agent by
Nous Research with persistent memory, 40+ built-in tools, MCP integration, and a
self-improvement loop that learns from every task it completes. It supports custom OpenAI-compatible endpoints, making
it straightforward to point at a model on SaladCloud.
Hermes Agent pairs especially well with SaladCloud because:
- OpenAI-compatible endpoint support - connects to any SaladCloud deployment via the built-in setup wizard
- Rich MCP integration - connect GitHub, Webflow, Playwright, and any other MCP server
- Self-improving - the agent writes structured records of what worked and what didn’t, improving future task
performance automatically
Hermes Agent works with SaladCloud in two ways:
- Salad AI Gateway - no infrastructure to deploy or manage. Sign up for access,
use your Salad API key with standard authentication. Currently in closed beta with monthly flat-rate access.
- Self-hosted model - deploy your own LLM recipe on SaladCloud Container Engine, billed per hour. Note: Hermes Agent
cannot pass custom headers, so authentication must be disabled on self-hosted deployments.
Prerequisites
Before getting started, make sure you have:
- A SaladCloud account
- Linux, macOS, or Windows (WSL2)
- Python 3.10+ and Node.js 18+ (the installer handles these automatically)
Step-by-Step Setup
Step 1: Choose Your Backend
Salad AI Gateway is the recommended option for Hermes Agent - it uses standard Bearer token authentication, which
Hermes Agent supports natively.
- Sign up for early access at salad.com/ai-gateway.
- Once approved, find your Salad API key in the portal.
Available models:| Model | Description |
|---|
qwen3.6-35b-a3b | Qwen 3.6 35B-A3B - best for agentic tasks, coding, and complex reasoning |
qwen3.6-27b | Qwen 3.6 27B - strong balance of capability and speed |
qwen3.5-9b | Qwen 3.5 9B - fastest response times, suited for lighter tasks |
Deploy an OpenAI-compatible LLM server on SaladCloud with the Authentication toggle disabled.Hermes Agent does not support custom HTTP headers for the main LLM provider. SaladCloud uses a Salad-Api-Key
header for authentication, which Hermes Agent cannot pass yet. Deploy your SaladCloud container group with
authentication disabled when using Hermes Agent or create a custom proxy to handle authentication.
- Go to the SaladCloud portal and create an account if you do not already have one.
- Create an organization or choose an existing one, then click “Deploy a container group”.
- Select an LLM recipe. For the best experience with Hermes Agent’s tool-calling capabilities, use
vLLM with a Hermes-format model, or the
Qwen3.6-35B-A3B (llama.cpp) recipe.
- Leave Authentication disabled - Hermes Agent cannot pass a
Salad-Api-Key header at the moment.
- Once deployed, note your API endpoint URL from SaladCloud (e.g.,
https://your-endpoint.salad.cloud). The
OpenAI-compatible API will be available at {endpoint}/v1.
Available recipes:Ready-to-deploy recipes (best for less technical users):Recipes for custom deployments (best for advanced users):
- vllm - Recommended; enable Hermes tool calling with
--enable-auto-tool-choice --tool-call-parser hermes
- llama.cpp - Supports GGUF models
- ollama - Simple model management
Step 2: Install Hermes Agent
Run the official installer - it handles Python, Node.js, dependencies, and the hermes command automatically:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Reload your shell, then verify:
source ~/.bashrc # or ~/.zshrc on macOS
hermes --version
Step 3: Run the Setup Wizard
Run the interactive setup wizard, which handles all model configuration:
When the wizard asks for a model provider, select Custom OpenAI-compatible endpoint and fill in:
- Base URL:
https://ai.salad.cloud/v1
- API key: your Salad API key
- Context length:
262144
The wizard will auto-detect the available models from your endpoint.
- Base URL:
https://your-endpoint.salad.cloud/v1
- API key:
dummy (any non-empty string - the endpoint has no auth)
- Context length:
262144
The wizard will auto-detect the available models from your endpoint, so you do not need to specify a model name
manually.
It also configures memory, tools, terminal backend, and optional messaging channels - you can accept defaults for
everything and adjust later via hermes config edit, which opens ~/.hermes/config.yaml.
Hermes Agent can import existing OpenClaw settings automatically if you have previously configured it.
Step 4: Verify Your Configuration
Before testing, confirm your setup by checking the configuration file:
Verify that:
model.base_url points to your endpoint with /v1 appended
model.api_key is set to your Salad API key (AI Gateway) or any non-empty value (self-hosted)
model.default is detected correctly from your endpoint
If anything looks incorrect, edit the config:
Step 5: Test the Connection
Launch Hermes Agent:
Test with a simple task:
“Summarize what SaladCloud is in two sentences.”
If Hermes Agent responds, your setup is complete.
Hermes Agent ships with 40+ built-in tools out of the box - no MCP servers or plugins required for most common tasks.
Web & Search
| Tool | What it does |
|---|
web_search | Search the web and return results |
web_extract | Extract and parse content from any URL |
browser_navigate | Control a real browser to navigate pages |
browser_snapshot | Capture the current state of a browser page |
browser_vision | Interact with pages using vision (click, read, etc.) |
Files & Terminal
| Tool | What it does |
|---|
terminal | Execute shell commands |
process | Start and manage background processes |
read_file | Read file contents |
patch | Modify files with targeted edits |
execute_code | Run code in an isolated environment |
| Tool | What it does |
|---|
text_to_speech | Convert text to spoken audio |
vision_analyze | Analyze and describe images |
image_generate | Create images via an image generation model |
Memory & Tasks
| Tool | What it does |
|---|
memory | Read and write persistent memory across sessions |
session_search | Search past conversation sessions |
todo | Create and manage task lists |
cronjob | Schedule recurring tasks |
clarify | Ask the user a clarifying question before proceeding |
Agent Orchestration
| Tool | What it does |
|---|
delegate_task | Spin up a subagent to handle a subtask in parallel |
send_message | Send a message to a connected messaging channel |
Most tools are enabled by default. You can enable or disable individual tools with:
MCP Server Integration
Hermes Agent has first-class MCP support. Add any MCP server to ~/.hermes/config.yaml under mcp_servers. More
details can be found in the official
MCP documentation
Filesystem MCP
mcp_servers:
filesystem:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-filesystem', '/path/to/your/project']
GitHub MCP
mcp_servers:
github:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-github']
env:
GITHUB_PERSONAL_ACCESS_TOKEN: your-github-token
Playwright MCP (Browser Automation)
mcp_servers:
playwright:
command: 'npx'
args: ['-y', '@playwright/mcp@latest']
Webflow MCP
mcp_servers:
webflow:
command: 'npx'
args: ['-y', 'webflow-mcp-server@latest']
env:
WEBFLOW_TOKEN: your-webflow-token
Remote HTTP MCP Server
mcp_servers:
remote_api:
url: 'https://mcp.example.com/mcp'
headers:
Authorization: 'Bearer your-token'
After editing the config, reload MCP servers without restarting:
You can also filter which tools each server exposes:
mcp_servers:
github:
command: 'npx'
args: ['-y', '@modelcontextprotocol/server-github']
env:
GITHUB_PERSONAL_ACCESS_TOKEN: your-token
tools:
include: [create_issue, list_issues, get_file_contents]
Hermes Agent’s persistent memory is one of its strongest features. It works out of the box but can be tuned:
memory:
memory_enabled: true
memory_char_limit: 2200 # ~800 tokens
Increase memory_char_limit if the agent is working on large projects and needs to retain more context between
sessions.
For safer code execution and better isolation, set the terminal backend to Docker:
terminal:
backend: docker
docker_image: 'nikolaik/python-nodejs:python3.11-nodejs20'
Model Recommendations
- Qwen 3.6-35B-A3B with llama.cpp: Good for conversational tasks and simpler agentic workflows
- Qwen 3.5-9B: Suitable for lightweight tasks; less reliable for complex multi-step tool chains