Three AI agents, one stack, no bills: running Claude Code, OpenClaw and Hermes Agent for free

Video: "Run Claude Code, OpenClaw & Hermes Agent Completely Free | Unlimited API" by Julian Goldie on YouTube.

What these three tools are and why running them together matters

Claude Code is Anthropic's terminal-native coding agent. It sits in your project folder, reads your codebase, and handles tasks like writing functions, debugging, and running tests — all from the command line. It is good at sustained, in-depth work on real codebases and generally better than browser-based alternatives at keeping context across a long session.

OpenClaw is a self-hosted open-source agent that works through messaging apps — Telegram, WhatsApp, Discord. It has a marketplace (ClawHub) with over 3,200 skills covering everything from web scraping to document processing. The appeal is that it lives in apps your team already has open, so it does not require anyone to switch tools to use it.

Hermes Agent, from Nous Research, is built around persistent memory and self-improving skills. It stores what it learns across sessions in a structured format and revisits that knowledge on future runs, which makes it noticeably better at repetitive tasks after a few cycles. It is closest to a long-running background worker: give it a goal and leave it running rather than prompting it back and forth.

Running all three together gives you different tool for different job types. Claude Code for technical work. OpenClaw for anything that flows through messaging. Hermes for ongoing, goal-driven tasks. The question has always been cost — each needs a model, and models cost money.

The free model layer that changes the equation

OpenRouter is a service that aggregates model APIs from dozens of providers behind a single endpoint. Critically, it hosts several genuinely free models alongside paid ones. The free tier is not a marketing trick — it is real usage at no cost, funded by the providers as a way to drive adoption.

Models like Owl Alpha (a 1-million-token context model with no listed provider) and GLM 4.7 Flash (from Zhipu AI) are both available on OpenRouter at no charge. They are not as capable as Claude Sonnet or GPT-4o, but for many agent tasks — especially routine automation, summarisation, and structured data work — the gap matters less than you might expect.

The setup Goldie demonstrates routes all three agents through OpenRouter using a single API key. Each tool supports OpenRouter as a backend (they accept the OpenAI-compatible API format that OpenRouter exposes), so switching is a configuration change rather than a code change.

What the free tier is good for and where it falls short

In practice, free models perform well on tasks that are well-defined and self-contained: pulling data from a page, drafting a document to a template, running a scheduled job, summarising a conversation. Where they noticeably struggle is multi-step reasoning with ambiguous instructions, code that needs precise architectural judgement, and anything where a slightly wrong output cascades into a bigger problem downstream.

Worth knowing: context limits matter more on agents than on chatbots. An agent running a long job accumulates context quickly. Owl Alpha's 1-million-token window is a genuine advantage for Hermes specifically, because Hermes keeps session state and skill files in context across runs. On a model with a 32K limit you would start seeing truncation on complex tasks.

For a small team testing AI agents before committing to a paid setup, the free stack is a sensible starting point. You get to understand where the limits are on real work rather than on demos. When you hit the ceiling — and you probably will — you switch one agent at a time to a paid model rather than rearchitecting everything.

The mobile and cloud angle

The video also covers running the stack on a cloud server (a £4/month VPS is enough for Hermes) and accessing it from a phone. OpenClaw already works in Telegram, so mobile access is built in for that one. Claude Code via a remote terminal is workable if less comfortable. The point is that the whole setup can run headlessly on a small server rather than requiring a laptop to stay open and connected.

That matters for Hermes in particular. Its goal-locking feature means it keeps working after you disconnect. Set a target, close the terminal, come back later. On a local machine that means leaving your computer on. On a £4 VPS it means nothing more than a monthly bill that costs less than a coffee.

Where this connects to NordSys

We set up and configure AI agent stacks for clients — including deciding which tools belong in which part of a workflow and what model tier actually makes sense for each job. A free model on OpenRouter might be all you need for inbox summarisation; it almost certainly is not enough for complex multi-step code work. Getting that configuration right from the start saves a lot of trial and error. See our AI Agents service if you want a proper setup rather than a YouTube demo.

See our AI Agents service →