Text your own AI assistant on WhatsApp: Hermes wired to FreeLLMAPI
Point Hermes Agent at a FreeLLMAPI backend and connect it to WhatsApp, so a memory-keeping assistant runs 24/7 on a free always-on server and costs nothing per message, with the wiring validated before you link a number.
CI-verified, 2/2 fixtures passing.
Intended Use
One person building a personal texting assistant. CI validates the Hermes wiring config: the model base_url points at the FreeLLMAPI endpoint (:3001/v1), the chosen model declares at least 64000 tokens of context (Hermes's documented floor), the gateway platform is supported, and a WhatsApp gateway is set to a spare (not main) number. The install, number-linking, and 24/7 run are fenced: CI does not run them.
Not for
- Your main WhatsApp number: automated bots on a personal account carry a real ban risk, so the config rejects a main number
- A managed or production assistant: the model backend is for personal experimentation only and degrades late in the day
- A one-click setup: expect a weekend in the terminal across the server, the backend, and the gateway
The Stack
Tested Against
- Hermes Agent (Nous Research, 2026-06)
- tashfeenahmed/freellmapi (2026-06)
- node@20
Side effects & data flow
- Network: none, local only
- Writes: ./hermes-wiring.json
- Credentials: none required
Prerequisites
- An always-on server (e.g. Oracle Cloud Always Free ARM: 2 cores / 12 GB)
- A running FreeLLMAPI backend (see the FreeLLMAPI recipe)
- A spare phone number for WhatsApp
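Before starting the steps, it can help to confirm the second prerequisite with a single request. The following is a minimal sketch, assuming FreeLLMAPI exposes an OpenAI-style `POST /chat/completions` route under the `/v1` base URL and accepts the key as a Bearer token. This recipe confirms neither detail, so check the FreeLLMAPI recipe first. `buildChatRequest`, `smokeTest`, and the `RUN_SMOKE` and `FREELLMAPI_KEY` environment variables are hypothetical names used only for illustration. It relies on the global `fetch` available in node@20.

```javascript
// Hypothetical helper: builds the request without sending it.
// Assumes an OpenAI-style chat completions route under base_url (unverified).
function buildChatRequest(baseUrl, apiKey, model) {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + apiKey,
      },
      body: JSON.stringify({
        model: model,
        messages: [{ role: "user", content: "Reply with the word ok." }],
      }),
    },
  };
}

// Sends one request and reports whether the backend answered with HTTP 2xx.
async function smokeTest(baseUrl, apiKey, model) {
  const req = buildChatRequest(baseUrl, apiKey, model);
  const res = await fetch(req.url, req.init);
  return { ok: res.ok, status: res.status };
}

// Only touches the network when explicitly asked to (RUN_SMOKE=1).
if (process.env.RUN_SMOKE === "1") {
  smokeTest("http://localhost:3001/v1", process.env.FREELLMAPI_KEY || "", "auto")
    .then((r) => {
      console.log(r.ok ? "backend OK" : "backend returned HTTP " + r.status);
      process.exit(r.ok ? 0 : 1);
    })
    .catch((e) => {
      console.error("backend unreachable: " + e.message);
      process.exit(1);
    });
}
```

Saved under any file name and run with `RUN_SMOKE=1` and the key in `FREELLMAPI_KEY`, it exits non-zero when the backend is down or rejects the request, which is a cheaper failure than discovering it after Hermes is installed.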
Steps
1. Author the Hermes wiring config and validate it
Capture how Hermes will be wired: a custom model endpoint at your FreeLLMAPI backend, a model whose context meets Hermes's 64k floor, and a WhatsApp gateway on a spare number. CI checks the wiring; the hermes install, hermes model picker, and gateway linking are fenced.

cat > hermes-wiring.json <<'JSON'
{
  "model": {
    "provider": "custom",
    "base_url": "http://localhost:3001/v1",
    "api_key": "freellmapi-REPLACE_WITH_YOUR_KEY",
    "model": "auto",
    "context_window": 65536
  },
  "gateway": { "platform": "whatsapp", "number": "spare" }
}
JSON

node -e '
const fs = require("fs");
const c = JSON.parse(fs.readFileSync("hermes-wiring.json", "utf8"));
// Report the first failed check and exit non-zero.
function bad(m) { console.error("BAD: " + m); process.exit(1); }
const model = c.model || {};
// Check 1: the model endpoint must be the FreeLLMAPI backend.
if (!model.base_url || model.base_url.indexOf(":3001/v1") === -1)
  bad("model.base_url must point at the FreeLLMAPI endpoint :3001/v1");
// Check 2: the declared context must meet the 64000-token floor.
const ctx = model.context_window || 0;
if (ctx < 64000)
  bad("model.context_window must be at least 64000 (Hermes floor); pick a larger-context free model");
// Check 3: the gateway platform must be one Hermes supports.
const gw = c.gateway || {};
const platforms = ["whatsapp", "telegram", "signal", "discord", "slack", "email", "cli"];
if (platforms.indexOf(gw.platform) === -1)
  bad("gateway.platform must be a supported Hermes platform");
// Check 4: WhatsApp must not be linked to a main number.
if (gw.platform === "whatsapp" && gw.number === "main")
  bad("use a spare WhatsApp number, never your main one");
console.log("wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context " + ctx + " >= 64000, gateway " + gw.platform + " on a " + gw.number + " number");
'

2. Build the stack and link WhatsApp (the steps CI does not run)
On the server, in order: install Hermes (curl install); run `hermes model` and pick the Custom Endpoint with your FreeLLMAPI base URL and key; verify that a plain `hermes` terminal chat works; run `hermes gateway setup` and choose WhatsApp to link the spare number; then keep `hermes gateway` alive under systemd or pm2. Do not add WhatsApp on top of a broken chat: if the terminal chat fails, fix the model endpoint first. All of this is fenced, so CI runs none of it.
Eval, 2 fixtures
Last passed: verified today
- wiring-ok (contains, timeout 30s · max $0)
  Expected: wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context 65536 >= 64000, gateway whatsapp on a spare number
- clean-exit (exit_code, timeout 30s · max $0)
  Expected: 0
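The two fixtures can be reproduced locally. This is a minimal sketch, assuming the wiring checks from the first step are restated as a pure function instead of the inline `node -e` script. `validateWiring` and `runFixtures` are hypothetical names, not part of Hermes or the CI harness, and the real harness may evaluate fixtures differently.

```javascript
// Hypothetical restatement of the wiring checks as a pure function.
// Returns { exitCode, output } instead of printing and exiting.
function validateWiring(c) {
  const fail = (m) => ({ exitCode: 1, output: "BAD: " + m });
  const model = c.model || {};
  if (!model.base_url || !model.base_url.includes(":3001/v1"))
    return fail("model.base_url must point at the FreeLLMAPI endpoint :3001/v1");
  const ctx = model.context_window || 0;
  if (ctx < 64000)
    return fail("model.context_window must be at least 64000 (Hermes floor); pick a larger-context free model");
  const gw = c.gateway || {};
  const platforms = ["whatsapp", "telegram", "signal", "discord", "slack", "email", "cli"];
  if (!platforms.includes(gw.platform))
    return fail("gateway.platform must be a supported Hermes platform");
  if (gw.platform === "whatsapp" && gw.number === "main")
    return fail("use a spare WhatsApp number, never your main one");
  return {
    exitCode: 0,
    output: "wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context " + ctx +
      " >= 64000, gateway " + gw.platform + " on a " + gw.number + " number",
  };
}

// The two fixtures: "contains" on the output, "exit_code" on the status.
function runFixtures(result) {
  const expected = "wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context 65536 >= 64000, gateway whatsapp on a spare number";
  return {
    "wiring-ok": result.output.includes(expected),
    "clean-exit": result.exitCode === 0,
  };
}

// The config from the recipe.
const good = {
  model: {
    provider: "custom",
    base_url: "http://localhost:3001/v1",
    api_key: "freellmapi-REPLACE_WITH_YOUR_KEY",
    model: "auto",
    context_window: 65536,
  },
  gateway: { platform: "whatsapp", number: "spare" },
};
// Same config, but pointed at a main number.
const risky = { ...good, gateway: { platform: "whatsapp", number: "main" } };

console.log(runFixtures(validateWiring(good)));
console.log(runFixtures(validateWiring(risky)));
```

The first call reports both fixtures passing. The second reports both failing, because the main-number check returns a non-zero exit code and a `BAD:` message instead of the expected line.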
Results
Hermes (Nous Research) is the brain: it keeps memory and connects to WhatsApp, Telegram, and Signal out of the box. Pointed at a FreeLLMAPI custom endpoint, the model cost drops to zero. The whole stack fits on an Oracle Cloud Always Free ARM VM. Hermes requires a large-context model (its docs state a 64k floor), so this recipe enforces that in the wiring. It is a weekend project, not a one-click install, and WhatsApp does not officially welcome bots on a personal account, so use a spare number.
Related workflows
- FreeLLMAPI: one socket, sixteen free model tiers with auto-fallback
- Flue: define a sandboxed headless agent and deploy it anywhere
- Build the Fugu pattern in the open: fan out, assign roles, verify
- Sakana Fugu: A/B it on your own task before you migrate
- Run GLM-5.2 fully local on a Mac Studio and drive it with Hermes
- Eve: make evals the deploy gate, not a vibe check