Agents · Open Source · Free · Active · Machine-verified · advanced · ~90 min setup

Text your own AI assistant on WhatsApp: Hermes wired to FreeLLMAPI

Point Hermes Agent at a FreeLLMAPI backend and connect it to WhatsApp. The result is a memory-keeping assistant that runs 24/7 on a free always-on server and costs nothing per message, with the wiring validated before you link a number.

by Shilpa Mitra · verified today · v1.0.0

CI-verified, 2/2 fixtures passing.

Intended Use

For one person building a personal texting assistant. CI validates the Hermes wiring config against four checks: the model base_url points at the FreeLLMAPI endpoint (:3001/v1); the chosen model declares at least 64000 tokens of context (Hermes's documented floor); the gateway platform is supported; and a WhatsApp gateway uses a spare number, not your main one. The install, number-linking, and 24/7 run are fenced: CI does not execute them.
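The four checks can also be sketched as a pure function that collects every problem instead of stopping at the first one, which makes it easier to see which configs get rejected. This is an illustrative sketch, not the recipe's CI code: `validateWiring`, `SUPPORTED_PLATFORMS`, and `MIN_CONTEXT` are hypothetical names, and the platform list is copied from the validator in step 1.

```javascript
// Hypothetical helper: the same four checks as step 1, returned as a list.
const SUPPORTED_PLATFORMS = ["whatsapp", "telegram", "signal", "discord", "slack", "email", "cli"];
const MIN_CONTEXT = 64000; // Hermes's documented floor, per this recipe

function validateWiring(config) {
  const problems = [];
  const model = (config && config.model) || {};
  const gateway = (config && config.gateway) || {};

  // Check 1: endpoint must be the FreeLLMAPI backend on :3001/v1.
  if (typeof model.base_url !== "string" || !model.base_url.includes(":3001/v1")) {
    problems.push("model.base_url must point at the FreeLLMAPI endpoint :3001/v1");
  }
  // Check 2: declared context window must meet the floor (missing counts as too small).
  if (!(Number(model.context_window) >= MIN_CONTEXT)) {
    problems.push("model.context_window must be at least " + MIN_CONTEXT);
  }
  // Check 3: gateway platform must be in the supported list.
  if (!SUPPORTED_PLATFORMS.includes(gateway.platform)) {
    problems.push("gateway.platform must be a supported Hermes platform");
  }
  // Check 4: WhatsApp must not be wired to the main number.
  if (gateway.platform === "whatsapp" && gateway.number === "main") {
    problems.push("use a spare WhatsApp number, never your main one");
  }
  return problems;
}

const good = {
  model: { provider: "custom", base_url: "http://localhost:3001/v1", context_window: 65536 },
  gateway: { platform: "whatsapp", number: "spare" },
};
const risky = {
  model: { provider: "custom", base_url: "http://localhost:3001/v1", context_window: 32768 },
  gateway: { platform: "whatsapp", number: "main" },
};

console.log(validateWiring(good));  // empty list: nothing to fix
console.log(validateWiring(risky)); // context too small, and main number rejected
```

Returning a list rather than exiting lets you fix a config in one pass; the step 1 script exits on the first failure, which is what CI needs.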

Not for

  • Your main WhatsApp number: automated bots on a personal account carry a real ban risk, so the config rejects a main number
  • A managed or production assistant: the model backend is for personal experimentation only and degrades late in the day
  • A one-click setup: expect a terminal weekend across the server, the backend, and the gateway

The Stack

Tested Against

  • Hermes Agent (Nous Research, 2026-06)
  • tashfeenahmed/freellmapi (2026-06)
  • node@20

Side effects & data flow

Network: none, local only
Writes: ./hermes-wiring.json
Credentials: none required

Prerequisites

  • An always-on server (e.g. Oracle Cloud Always Free ARM: 2 cores / 12 GB)
  • A running FreeLLMAPI backend (see the FreeLLMAPI recipe)
  • A spare phone number for WhatsApp

Steps

  1. Author the Hermes wiring config and validate it

    Capture how Hermes will be wired: a custom model endpoint at your FreeLLMAPI backend, a model whose context meets Hermes's 64k floor, and a WhatsApp gateway on a spare number. CI checks only this wiring; the Hermes install, the `hermes model` picker, and gateway linking are fenced, so CI does not run them.

    cat > hermes-wiring.json <<'JSON'
    {
      "model": {
        "provider": "custom",
        "base_url": "http://localhost:3001/v1",
        "api_key": "freellmapi-REPLACE_WITH_YOUR_KEY",
        "model": "auto",
        "context_window": 65536
      },
      "gateway": { "platform": "whatsapp", "number": "spare" }
    }
    JSON
    node -e '
    const fs = require("fs");
    // Load the wiring file written by the heredoc above.
    const c = JSON.parse(fs.readFileSync("hermes-wiring.json", "utf8"));
    // Report the first failed check on stderr and exit non-zero so CI fails.
    function bad(m) { console.error("BAD: " + m); process.exit(1); }
    const model = c.model || {};
    // Check 1: the endpoint must be the FreeLLMAPI backend on :3001/v1.
    if (typeof model.base_url !== "string" || !model.base_url.includes(":3001/v1")) bad("model.base_url must point at the FreeLLMAPI endpoint :3001/v1");
    // Check 2: the declared context window must meet the 64000-token floor.
    const ctx = Number(model.context_window) || 0;
    if (ctx < 64000) bad("model.context_window must be at least 64000 (Hermes floor); pick a larger-context free model");
    // Check 3: the gateway platform must be one Hermes supports.
    const gw = c.gateway || {};
    const platforms = ["whatsapp", "telegram", "signal", "discord", "slack", "email", "cli"];
    if (!platforms.includes(gw.platform)) bad("gateway.platform must be a supported Hermes platform");
    // Check 4: WhatsApp must never be linked to the main number.
    if (gw.platform === "whatsapp" && gw.number === "main") bad("use a spare WhatsApp number, never your main one");
    console.log("wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context " + ctx + " >= 64000, gateway " + gw.platform + " on a " + gw.number + " number");
    '
  2. Build the stack and link WhatsApp (the steps CI does not run)

    On the server, work in this order: install Hermes (curl install); run hermes model and pick the Custom Endpoint with your FreeLLMAPI base URL and key; verify that a plain `hermes` terminal chat works; run hermes gateway setup -> WhatsApp to link the spare number; then keep hermes gateway alive under systemd/pm2. Do not add WhatsApp on top of a broken chat: fix the terminal chat first. All of this is fenced, so CI does not run it.
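Before linking a number, it can help to confirm that the backend answers at all, independently of Hermes. The sketch below builds and sends one chat request from the wiring values. It rests on an assumption that is not stated in this recipe: that FreeLLMAPI accepts an OpenAI-style `POST {base_url}/chat/completions` with a Bearer key and returns `choices[0].message.content`. The helper names `buildChatRequest` and `smokeTest` are hypothetical; check the FreeLLMAPI documentation for the actual route and response shape.

```javascript
// ASSUMPTION: OpenAI-style chat route and Bearer auth; verify against FreeLLMAPI docs.
function buildChatRequest(model, prompt) {
  // Strip any trailing slash so the joined URL has exactly one separator.
  const url = model.base_url.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + model.api_key,
      },
      body: JSON.stringify({
        model: model.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Sends one message and resolves to the reply text; throws on any failure.
// fetchImpl defaults to the global fetch available in Node 18 and later.
async function smokeTest(model, fetchImpl = fetch) {
  const { url, options } = buildChatRequest(model, "Reply with the word pong.");
  const res = await fetchImpl(url, options);
  if (!res.ok) throw new Error("backend returned HTTP " + res.status);
  const data = await res.json();
  const reply = data && data.choices && data.choices[0] && data.choices[0].message;
  if (!reply || typeof reply.content !== "string") {
    throw new Error("unexpected response shape");
  }
  return reply.content;
}

// Values mirror the model block of hermes-wiring.json.
const wiringModel = {
  base_url: "http://localhost:3001/v1",
  api_key: "freellmapi-REPLACE_WITH_YOUR_KEY",
  model: "auto",
};
console.log(buildChatRequest(wiringModel, "ping").url);
```

On the server, calling `smokeTest(wiringModel).then(console.log, console.error)` with your real key would exercise the backend once. If that fails, the problem is in the backend or the key, not in Hermes or the gateway.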

Eval, 2 fixtures

Last passed: verified today
  • wiring-ok · contains · timeout 30s · max $0

    Expected: wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context 65536 >= 64000, gateway whatsapp on a spare number

  • clean-exit · exit_code · timeout 30s · max $0

    Expected: 0
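To make the two fixture types concrete, here is a minimal sketch of how a `contains` check and an `exit_code` check might be applied to the result of running step 1. The real CI harness is not shown in this recipe, so the fixture object shape, the `run` object, and the `checkFixture` name are all hypothetical.

```javascript
// Hypothetical fixture shape: { name, type, expected }.
// Hypothetical run shape: { stdout, exitCode } captured from the step 1 command.
function checkFixture(fixture, run) {
  switch (fixture.type) {
    case "contains":
      // Passes when the expected text appears anywhere in stdout.
      return run.stdout.includes(fixture.expected);
    case "exit_code":
      // Passes when the process exit code matches exactly.
      return run.exitCode === fixture.expected;
    default:
      throw new Error("unknown fixture type: " + fixture.type);
  }
}

const fixtures = [
  {
    name: "wiring-ok",
    type: "contains",
    expected: "wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context 65536 >= 64000, gateway whatsapp on a spare number",
  },
  { name: "clean-exit", type: "exit_code", expected: 0 },
];

// What a successful validator run looks like.
const passingRun = {
  stdout: "wiring OK: Hermes -> FreeLLMAPI (:3001/v1), context 65536 >= 64000, gateway whatsapp on a spare number\n",
  exitCode: 0,
};
// What a rejected config looks like: the BAD message goes to stderr, exit code is 1.
const failingRun = { stdout: "", exitCode: 1 };

for (const f of fixtures) {
  console.log(f.name + ": " + (checkFixture(f, passingRun) ? "pass" : "fail"));
}
```

Both fixtures exercise the same command, so a config that trips any check in step 1 fails both: stdout lacks the expected line and the exit code is 1.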

Results

Hermes (Nous Research) is the brain: it keeps memory and connects to WhatsApp, Telegram, and Signal out of the box. Pointed at a FreeLLMAPI custom endpoint, the model cost drops to zero. The whole stack fits on an Oracle Cloud Always Free ARM VM. Hermes requires a large-context model (its docs state a 64k floor), so this recipe enforces that in the wiring. It is a weekend project, not a one-click install, and WhatsApp does not officially welcome bots on a personal account, so use a spare number.
