Video: "Hermes Agent + Claude MCP is INSANE (FREE!)" by Julian Goldie on YouTube.
What MCP actually does in this context
The Model Context Protocol is a standard that lets an AI model call out to external tools and services during a conversation rather than just generating text. In most Claude setups, MCP means things like web search, file access, or database queries — tools that extend what the model can read and write within a single session. Connecting Claude to Hermes Agent via MCP is different: instead of calling a database or a search engine, Claude is calling another AI agent.
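To make the mechanics concrete, here is a sketch of the JSON-RPC message an MCP client sends when it invokes a tool. The envelope (method `tools/call`) follows the MCP specification; the tool name `delegate_task` and its arguments are hypothetical stand-ins for whatever the Hermes MCP server actually exposes.

```python
import json

# A tools/call request per the MCP spec. The "name" and "arguments"
# values are illustrative, not Hermes's real tool surface.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "delegate_task",  # hypothetical Hermes-side tool
        "arguments": {
            "goal": "Summarise competitor positioning",
            "format": "markdown",
        },
    },
}

print(json.dumps(request, indent=2))
```

From Claude's point of view, delegating to Hermes is structurally identical to calling a search or database tool; the difference is what happens on the other end of the call.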
That shift matters. It means Claude can receive a high-level goal from a user, decompose it into subtasks, and hand those subtasks to Hermes — which then executes them using its own memory, skills, and tools, and returns results. Claude stays in the loop as the planning layer; Hermes handles the execution. Neither does the other's job well alone. Together they cover considerably more ground.
How the delegation loop works
The setup Goldie demonstrated works in three steps. First, Claude receives a goal — something like "research competitors in this space and summarise their positioning". Second, Claude calls the Hermes MCP server with a structured task description, including any relevant context and the format it expects back. Third, Hermes takes the task, runs it using its skill library and tool access, and returns results to Claude, which then synthesises them into a final response.
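The three steps above can be sketched as a simple loop. Every function here is a stub standing in for a real call; none of these names come from Claude's or Hermes's actual APIs.

```python
def claude_plan(goal):
    # Stub: a real call would ask Claude to decompose the goal
    # into structured subtasks with context and an expected format.
    return [{"task": f"research: {goal}", "format": "summary"}]

def hermes_execute(task):
    # Stub: a real call would send the task to the Hermes MCP server
    # and block until Hermes returns its result.
    return f"result for {task['task']}"

def claude_synthesise(goal, results):
    # Stub: a real call would ask Claude to merge Hermes's results
    # into one final response.
    return f"{goal}: " + "; ".join(results)

def delegate(goal):
    subtasks = claude_plan(goal)                     # step 1: plan
    results = [hermes_execute(t) for t in subtasks]  # step 2: execute
    return claude_synthesise(goal, results)          # step 3: synthesise
```

The shape is the point: Claude owns the first and last steps, Hermes owns the middle one, and the MCP server is the seam between them.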
The important detail is that Hermes is not just fetching data — it is running its own multi-step workflow within each delegated task. A research brief might trigger Hermes to search several sources, cross-reference what it finds against its memory, and produce a structured summary before returning anything to Claude. The delegation loop adds Hermes's depth of execution to Claude's reasoning, without either tool needing to be rebuilt to accommodate the other.
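As an illustration of that internal workflow, a single delegated research brief might look roughly like this inside Hermes. The function and data shapes are entirely hypothetical; they show the pattern (multiple sources, a memory cross-reference, a structured return) rather than Hermes's real implementation.

```python
def run_research_task(brief, sources, memory):
    # Gather raw findings from every source that mentions the brief.
    findings = []
    for source in sources:
        findings.extend(item for item in source if brief.lower() in item.lower())

    # Cross-reference against memory: separate what Hermes has seen
    # before from what is genuinely new.
    known = [f for f in findings if f in memory]
    new = [f for f in findings if f not in memory]

    # Return a structured summary, not raw data.
    return {"brief": brief, "known": known, "new": new}
```

Only the final structured summary crosses back over the MCP boundary to Claude; the intermediate searching and cross-referencing stays inside Hermes.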
What tasks it handles well
The combination is most useful on tasks that require both planning and sustained execution. Content research and drafting workflows, where Claude structures the brief and Hermes does the research before Claude writes, have shown consistent results. Data-gathering tasks with a clear scope but multi-step execution also work well. Repeated workflows are a natural fit: Hermes's skill memory makes the second and third runs faster and more accurate than the first.
Where it is less suited: tasks requiring real-time data, where Hermes's retrieval tools introduce latency; anything requiring tight conversational back-and-forth with a user; and tasks where Claude's reasoning must adapt moment to moment to partial results. The delegation model assumes Hermes can complete a task and return a result; it does not handle cases where that loop needs to be interrupted and redirected mid-run.
What to know before setting it up
The MCP server that connects Claude to Hermes is a separate component that must run alongside both tools. Goldie's walkthrough covers the configuration, but some technical familiarity is required: you are effectively running three things (Claude, the MCP server, and Hermes) and making sure they can talk to each other over the right transport, typically stdio or a local port. Once it is working, the integration is stable; the overhead is front-loaded into setup, not ongoing operation.
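For orientation, registering an MCP server with the Claude desktop client uses an `mcpServers` entry like the one below. The `mcpServers` envelope is the real config format; the server name `hermes`, the command, and the path are placeholders, and the actual Hermes MCP server's launch command will differ.

```json
{
  "mcpServers": {
    "hermes": {
      "command": "node",
      "args": ["/path/to/hermes-mcp-server/index.js"]
    }
  }
}
```

Once this entry is in place and the server starts cleanly, Claude lists the server's tools automatically and can call them in conversation.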
Worth noting: this is a free setup built from open-source components, so the ongoing costs are whatever model you run Hermes against plus Claude's API usage. You will need a Claude API key, though the volume of calls the delegation loop generates is typically modest compared to a direct conversational session.
Where this connects to NordSys
We configure Hermes Agent with MCP integrations as part of our AI Agents service — which means Claude-to-Hermes delegation is something we can set up and test in a working environment rather than leaving you to debug three interconnected tools from scratch. If you want this kind of setup running for your business without the configuration overhead, see our AI Agents service.
See our AI Agents service →