Route through a gateway with a tested open-weights fallback
Keep model access from being a single point of failure: route through an OpenAI-compatible gateway and pin a fallback that is open-weights and has actually been tested, so a pulled or deprecated model is a two-minute config change, not a lost week.
CI-verified, 2/2 fixtures passing.
Intended Use
Anyone whose work depends on a specific model staying available. CI validates the failover config: it routes through a gateway base_url (not a hard-wired model), names a primary, and pins a fallback that differs from the primary, is marked open-weights, carries a tested_on date, and lists at least one smoke prompt you actually ran. CI needs no key and makes no model call; the real failover runs are fenced, meaning CI does not check them.
Not for
- Treating a listed-but-untested backup as insurance. A backup you have never run is a hope; CI requires a tested_on date and smoke prompts precisely so this recipe cannot pass on a fantasy fallback.
- Assuming an open model equals the frontier. The biggest open models need serious hardware; the realistic fallback is a smaller open model or a cheap cloud host, not a home rig matching a flagship.
- Forgetting that the gateway is itself a dependency: one more company whose terms can change. It shrinks switching time; it does not remove all risk.
- Building anything important on a preview model. Those get the shortest-notice removals; prefer generally available models with a real deprecation policy.
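The first point can be pushed further than a presence check. The validation script in the steps only confirms that tested_on splits into three dash-separated parts, so a stale or malformed date still passes. A minimal sketch of a stricter freshness check follows; the helper name `fallbackIsFresh` and the 30-day window are assumptions for illustration, not part of the recipe's CI.

```javascript
// Hypothetical freshness check. The 30-day default is an assumed policy,
// not something the recipe's CI enforces.
const DAY_MS = 24 * 60 * 60 * 1000;

function fallbackIsFresh(testedOn, now = new Date(), maxAgeDays = 30) {
  // Accept only a strict YYYY-MM-DD string.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(testedOn))) return false;

  const tested = new Date(testedOn + "T00:00:00Z");
  if (Number.isNaN(tested.getTime())) return false;

  // Reject impossible days (e.g. Feb 30) that some engines roll over
  // into the next month instead of marking invalid.
  if (tested.toISOString().slice(0, 10) !== testedOn) return false;

  // Fresh means: not in the future and no older than the window.
  const ageDays = (now.getTime() - tested.getTime()) / DAY_MS;
  return ageDays >= 0 && ageDays <= maxAgeDays;
}
```

Passing `now` explicitly keeps the check deterministic in tests; in a scheduled job the default of the current time is what you would want.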
The Stack
Tested Against
OpenAI-compatible gateway config (2026-07)
node@20
Side effects & data flow
- Network: none, local only
- Writes: ./failover.json
- Credentials: none required
Prerequisites
- A gateway account (e.g. OpenRouter) and access to both models to actually run the failover
Steps
- 1
Write the failover config and validate it
Point at a gateway base_url, set your primary, and pin a fallback that is open-weights, dated with when you last tested it, and carries the smoke prompts you ran. CI checks all of that is present, so the config cannot pass with an untested or recallable fallback. Actually failing over needs keys and is fenced.
cat > failover.json <<'JSON'
{
  "base_url": "https://openrouter.ai/api/v1",
  "primary": "anthropic/claude-fable-5",
  "fallback": {
    "model": "z-ai/glm-5.2",
    "open_weights": true,
    "tested_on": "2026-07-01",
    "smoke_prompts": [
      "summarize this ticket",
      "extract the invoice fields",
      "draft a reply in my voice"
    ]
  }
}
JSON

node -e '
const fs = require("fs");
const c = JSON.parse(fs.readFileSync("failover.json", "utf8"));
function bad(m) { console.error("BAD: " + m); process.exit(1); }
if (!c.base_url) bad("no gateway base_url (route through a gateway, do not hard-wire a model)");
if (!c.primary) bad("no primary model set");
const f = c.fallback || {};
if (!f.model) bad("no fallback model set");
if (f.model === c.primary) bad("fallback must differ from the primary");
if (f.open_weights !== true) bad("fallback must be open-weights (a hosted fallback can be recalled too)");
const parts = String(f.tested_on || "").split("-");
if (parts.length !== 3 || !parts[0]) bad("fallback needs a tested_on date (a backup you never ran is a hope)");
const sp = f.smoke_prompts || [];
if (!Array.isArray(sp) || sp.length < 1) bad("fallback needs at least one smoke prompt you actually ran");
console.log("config OK: gateway primary " + c.primary + " with tested open-weights fallback " + f.model + " (tested " + f.tested_on + ", " + sp.length + " smoke prompt(s))");
'
- 2
Rehearse the failover (the model step, not checked by CI)
Run your smoke prompts through the fallback for real, on a schedule, and confirm it holds up on the tasks that matter. When the primary vanishes, flip one line to the fallback. Keep your prompts and context in a portable form so the switch is copy-and-paste, not a rebuild. The runs and quality are fenced.
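The rehearsal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the recipe's own tooling: it assumes the gateway serves the OpenAI-style POST {base_url}/chat/completions endpoint, and the function names `sendPrompt` and `rehearse` are hypothetical. It needs a real key to run against a gateway, so like the rest of this step it sits outside CI.

```javascript
// Sketch only. Assumes an OpenAI-compatible POST {base_url}/chat/completions
// endpoint; sendPrompt and rehearse are hypothetical helper names.

// Sends one prompt to one model through the gateway and returns the reply text.
async function sendPrompt(baseUrl, apiKey, model, prompt, fetchImpl = fetch) {
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  const res = await fetchImpl(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + apiKey,
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  if (!res.ok) throw new Error("gateway returned HTTP " + res.status);
  const data = await res.json();
  const text = data.choices?.[0]?.message?.content;
  if (typeof text !== "string" || text.trim() === "") throw new Error("empty reply");
  return text;
}

// Runs every smoke prompt against the fallback model. It does not throw on a
// failed prompt; it records the failure so you can see which task broke.
async function rehearse(config, send) {
  const results = [];
  for (const prompt of config.fallback.smoke_prompts) {
    try {
      const reply = await send(config.fallback.model, prompt);
      results.push({ prompt, ok: true, reply });
    } catch (err) {
      results.push({ prompt, ok: false, error: String(err.message || err) });
    }
  }
  const passed = results.length > 0 && results.every((r) => r.ok);
  // Only a fully passing rehearsal earns a new tested_on date (UTC, YYYY-MM-DD).
  const testedOn = passed
    ? new Date().toISOString().slice(0, 10)
    : config.fallback.tested_on;
  return { passed, testedOn, results };
}
```

To wire it up, pass a sender such as `(model, prompt) => sendPrompt(config.base_url, process.env.GATEWAY_API_KEY, model, prompt)`, where GATEWAY_API_KEY is an assumed variable name. A passing run only proves the fallback answered; judging whether the replies are good enough for your tasks is still a human read of `results`.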
Eval, 2 fixtures
Last passed: verified today
failover-ok (contains, timeout 30s · max $0)
Expected:
config OK: gateway primary anthropic/claude-fable-5 with tested open-weights fallback z-ai/glm-5.2 (tested 2026-07-01, 3 smoke prompt(s))
clean-exit (exit_code, timeout 30s · max $0)
Expected:
0
Results
Models vanish for reasons outside your control: Fable 5 was pulled by export controls days after launch and restored weeks later, and vendors retire models on ordinary deprecation calendars too. The fix does not bring risk to zero; it shrinks the blast radius. A gateway makes the swap a one-line change, and a tested open-weights fallback cannot be recalled by a vendor or a government the way a hosted model can.
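The recipe's swap is a manual one-line config change. For callers that would rather not wait for a human, an automatic variant can be sketched as below. This is a different technique from the manual flip, shown only as an illustration: the helper name `completeWithFailover` is hypothetical, and treating HTTP 404 or 410 as "model gone" is an assumption, since gateways may signal a removed model differently.

```javascript
// Sketch of an automatic variant of the swap. Assumption: a removed or
// deprecated model surfaces as an error whose `status` is 404 or 410.
const MODEL_GONE_STATUSES = new Set([404, 410]);

// `send(model, prompt)` is any async function that calls the gateway and
// throws an error carrying an HTTP `status` on failure.
async function completeWithFailover(config, prompt, send) {
  try {
    const reply = await send(config.primary, prompt);
    return { model: config.primary, reply, failedOver: false };
  } catch (err) {
    // Rate limits, auth errors, and the like are not a vanished model:
    // surface them instead of silently changing models.
    if (!MODEL_GONE_STATUSES.has(err.status)) throw err;
    const model = config.fallback.model;
    const reply = await send(model, prompt);
    return { model, reply, failedOver: true };
  }
}
```

Returning `failedOver` lets the caller log or alert on the switch, so a quiet drop to the fallback does not go unnoticed for weeks.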