Automation · Hybrid · Free · Active · Machine-verified · beginner · ~15 min setup

Route through a gateway with a tested open-weights fallback

Keep model access from being a single point of failure: route through an OpenAI-compatible gateway and pin a fallback that is open-weights and has actually been tested, so a pulled or deprecated model is a two-minute config change, not a lost week.

by Shilpa Mitra · verified today · v1.0.0

Run this workflow

CI-verified, 2/2 fixtures passing.

Intended Use

Anyone whose work depends on a specific model staying available. CI validates the failover config: it routes through a gateway base_url (not a hard-wired model), names a primary, and pins a fallback that differs from the primary, is marked open-weights, carries a tested_on date, and lists at least one smoke prompt you actually ran. No key, no model call. The runs are fenced.

Not for

  • Treating a listed-but-untested backup as insurance: a backup you have never run is a hope; CI requires a tested_on date and smoke prompts precisely so this recipe cannot pass on a fantasy fallback
  • Assuming an open model equals the frontier: the biggest open models need serious hardware, so the realistic fallback is a smaller open model or a cheap cloud host, not a home rig matching a flagship
  • Forgetting that the gateway is itself a dependency: it is one more company whose terms can change; it shrinks switching time, but it does not remove all risk
  • Building anything important on a preview model: those get the shortest-notice removals; prefer generally available models with a real deprecation policy

The Stack

Tested Against

OpenAI-compatible gateway config (2026-07) · node@20

Side effects & data flow

  • Network: none, local only
  • Writes: ./failover.json
  • Credentials: none required

Prerequisites

  • A gateway account (e.g. OpenRouter) and access to both models to actually run the failover

Steps

  1. Write the failover config and validate it

    Point at a gateway base_url, set your primary, and pin a fallback that is open-weights, dated with when you last tested it, and carries the smoke prompts you ran. CI checks all of that is present, so the config cannot pass with an untested or recallable fallback. Actually failing over needs keys and is fenced.

    cat > failover.json <<'JSON'
    {
      "base_url": "https://openrouter.ai/api/v1",
      "primary": "anthropic/claude-fable-5",
      "fallback": {
        "model": "z-ai/glm-5.2",
        "open_weights": true,
        "tested_on": "2026-07-01",
        "smoke_prompts": ["summarize this ticket", "extract the invoice fields", "draft a reply in my voice"]
      }
    }
    JSON
    node -e '
    const fs = require("fs");
    const c = JSON.parse(fs.readFileSync("failover.json", "utf8"));
    function bad(m) { console.error("BAD: " + m); process.exit(1); }
    if (!c.base_url) bad("no gateway base_url (route through a gateway, do not hard-wire a model)");
    if (!c.primary) bad("no primary model set");
    const f = c.fallback || {};
    if (!f.model) bad("no fallback model set");
    if (f.model === c.primary) bad("fallback must differ from the primary");
    if (f.open_weights !== true) bad("fallback must be open-weights (a hosted fallback can be recalled too)");
    // tested_on must be shaped like an ISO date (YYYY-MM-DD); this checks shape, not calendar validity
    if (!/^\d{4}-\d{2}-\d{2}$/.test(String(f.tested_on || ""))) bad("fallback needs a tested_on date (a backup you never ran is a hope)");
    const sp = f.smoke_prompts || [];
    if (!Array.isArray(sp) || sp.length < 1) bad("fallback needs at least one smoke prompt you actually ran");
    console.log("config OK: gateway primary " + c.primary + " with tested open-weights fallback " + f.model + " (tested " + f.tested_on + ", " + sp.length + " smoke prompt(s))");
    '
  2. Rehearse the failover (the model step, not checked by CI)

    Run your smoke prompts through the fallback for real, on a schedule, and confirm it holds up on the tasks that matter. When the primary vanishes, flip one line to the fallback. Keep your prompts and context in a portable form so the switch is copy-and-paste, not a rebuild. The runs and quality are fenced.
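The rehearsal in step 2 could be scripted roughly as follows. This is a minimal sketch, not part of the verified recipe: it assumes the gateway serves an OpenAI-style `POST {base_url}/chat/completions` endpoint, and the function names (`buildChatRequest`, `rehearseFallback`, `stampTestedOn`) are hypothetical. It only collects replies; judging whether the fallback holds up on your tasks stays a manual step.

```javascript
// Sketch: rehearse the fallback by sending each smoke prompt through the gateway.
// Assumption: the gateway accepts OpenAI-style chat requests at {base_url}/chat/completions.

// Build the HTTP request for one prompt against one model (pure, no network).
function buildChatRequest(config, model, prompt, apiKey) {
  return {
    url: config.base_url.replace(/\/+$/, "") + "/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify({
        model: model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Run every smoke prompt through the fallback model and collect the replies.
// fetchImpl is injectable so the function can be exercised without a network.
async function rehearseFallback(config, apiKey, fetchImpl = fetch) {
  const results = [];
  for (const prompt of config.fallback.smoke_prompts) {
    const req = buildChatRequest(config, config.fallback.model, prompt, apiKey);
    const res = await fetchImpl(req.url, req.options);
    if (!res.ok) {
      throw new Error(`fallback failed on "${prompt}": HTTP ${res.status}`);
    }
    const data = await res.json();
    results.push({ prompt, reply: data.choices[0].message.content });
  }
  return results;
}

// Return a copy of the config with fallback.tested_on set to the given day (UTC, YYYY-MM-DD).
function stampTestedOn(config, date) {
  const day = date.toISOString().slice(0, 10);
  return { ...config, fallback: { ...config.fallback, tested_on: day } };
}
```

One way to use it: load failover.json, call `rehearseFallback(config, process.env.GATEWAY_API_KEY)` (the variable name is an assumption), read the replies yourself, and only then write back `stampTestedOn(config, new Date())` so that tested_on reflects a run you actually reviewed.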

Eval, 2 fixtures

Last passed: verified today
  • failover-ok · contains · timeout 30s · max $0

    Expected: config OK: gateway primary anthropic/claude-fable-5 with tested open-weights fallback z-ai/glm-5.2 (tested 2026-07-01, 3 smoke prompt(s))

  • clean-exit · exit_code · timeout 30s · max $0

    Expected: 0
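For readers unfamiliar with the two matcher kinds named in the fixtures, here is a rough sketch of what "contains" and "exit_code" checks amount to. The fixture shape (`name`, `type`, `expected`) and the `checkFixture` helper are hypothetical illustrations, not the site's actual CI schema.

```javascript
// Sketch of the two matcher kinds listed in the fixtures.
// The object shape here is an assumption made for illustration only.

// result is whatever a runner captured from the step: { stdout, exitCode }.
function checkFixture(fixture, result) {
  switch (fixture.type) {
    case "contains":
      // Pass when the expected text appears anywhere in stdout.
      return result.stdout.includes(fixture.expected);
    case "exit_code":
      // Pass when the process exit code matches exactly.
      return result.exitCode === fixture.expected;
    default:
      throw new Error("unknown fixture type: " + fixture.type);
  }
}

const fixtures = [
  {
    name: "failover-ok",
    type: "contains",
    expected:
      "config OK: gateway primary anthropic/claude-fable-5 with tested open-weights fallback z-ai/glm-5.2 (tested 2026-07-01, 3 smoke prompt(s))",
  },
  { name: "clean-exit", type: "exit_code", expected: 0 },
];
```

Both fixtures exercise the happy path only: a config that trips any `bad(...)` branch in the validator exits with code 1 and never prints the "config OK" line, so it would fail both checks.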

Results

Models vanish for reasons outside your control: Fable 5 was pulled by export controls days after launch and restored weeks later, and vendors retire models on ordinary deprecation calendars too. The fix does not eliminate risk; it shrinks the blast radius. A gateway makes the swap one line, and a tested open-weights fallback cannot be recalled by a vendor or a government the way a hosted model can.
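The recipe describes a manual one-line flip. An automated variant, where the client retries on the fallback when the primary is gone, might look like the sketch below. It rests on assumptions the source does not state: an OpenAI-style `POST {base_url}/chat/completions` endpoint, and a removed model surfacing as HTTP 404 or 410. Check your gateway's documentation for the status codes it really returns. `completeWithFailover` is a hypothetical name.

```javascript
// Sketch: try the primary, and move to the fallback only when the model looks removed.
// Assumption: a pulled or retired model shows up as HTTP 404 or 410 at the gateway.
const MODEL_GONE = new Set([404, 410]);

async function completeWithFailover(config, prompt, apiKey, fetchImpl = fetch) {
  const url = config.base_url.replace(/\/+$/, "") + "/chat/completions";
  for (const model of [config.primary, config.fallback.model]) {
    const res = await fetchImpl(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify({
        model: model,
        messages: [{ role: "user", content: prompt }],
      }),
    });
    if (res.ok) {
      const data = await res.json();
      // Report which model answered so a silent failover stays visible to the caller.
      return { model, reply: data.choices[0].message.content };
    }
    // Anything other than "model gone" (rate limits, auth, outages) is surfaced, not masked.
    if (!MODEL_GONE.has(res.status)) {
      throw new Error(`gateway error: HTTP ${res.status} for ${model}`);
    }
    // Model-gone status: fall through to the next candidate.
  }
  throw new Error("primary and fallback are both unavailable");
}
```

Returning the model name alongside the reply is a deliberate choice: if the fallback starts answering, you want to notice, because its output quality is only as good as your last rehearsal.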
