Content · Commercial · Free · Active · Machine-verified · beginner · ~10 min setup

Swap to a cheap image model, but guard the cases it loses

Default image generation to a cheap model for general scenes, while proving that the cases the premium model dominates (text in the frame, charts, precise layout) are still routed to it. The swap then saves money without quietly degrading the work that has words in it.

by Shilpa Mitra · verified today · v1.0.0

CI-verified, 2/2 fixtures passing.

Intended Use

Teams generating images at volume who want the cheap model's price without shipping garbled text on slides and infographics. CI validates the swap-guard table: every case you declare the premium model dominates is routed to it, and at least one case routes to cheap. No key, no image call. The actual generation and any quality judgement are fenced.

Not for

Treating the gap as a flat percentage: on text-in-image and precise layout, GPT-Image-2 is not a few percent better, it is a different class, so route those to premium
Assuming the cheap model is self-hostable: earlier Wan releases (2.1/2.2) shipped open weights, but Wan 2.5 is API-only, so check the version before planning to run it yourself
Stakes where one wrong frame is expensive: keep a human check on the cheap output for anything customer-facing

The Stack

Wan (Alibaba): cheap default image model

Tested Against

artificialanalysis.ai image leaderboard (2026-06)
node@20

Side effects & data flow

Network: none, local only
Writes: ./decision.json
Credentials: none required

Prerequisites

API access to a cheap image model (e.g. Wan) and a premium one (e.g. GPT-Image-2) to actually generate

Steps

1
Write the swap-guard table and validate it

Declare the cheap default, the premium model, the cases the premium model dominates, and a route for each case. CI checks that every dominated case routes to premium and that something still routes to cheap (otherwise there is no saving). Generating images needs your keys and is fenced.

cat > decision.json <<'JSON'
{
  "modality": "image",
  "cheap": "wan-2.5",
  "premium": "gpt-image-2",
  "dominated_cases": ["text_in_frame", "charts_or_layout"],
  "route": [
    { "when": "text_in_frame", "use": "premium" },
    { "when": "charts_or_layout", "use": "premium" },
    { "when": "general_scene", "use": "cheap" }
  ]
}
JSON
node -e '
// Validate the swap-guard table in decision.json. No network, no API key.
const fs = require("fs");
const c = JSON.parse(fs.readFileSync("decision.json", "utf8"));
function bad(m) { console.error("BAD: " + m); process.exit(1); }
if (!c.cheap || !c.premium) bad("need both a cheap and a premium model");
const route = c.route || [];
// Map each case name to the tier it routes to.
const ruleFor = {};
for (const r of route) { if (r.when && r.use) ruleFor[r.when] = r.use; }
const dom = c.dominated_cases || [];
if (dom.length < 1) bad("declare the cases the premium model dominates");
// Every case the premium model dominates must route to premium.
for (const d of dom) {
  if (ruleFor[d] !== "premium") bad("dominated case " + d + " is not routed to the premium model");
}
// At least one case must route to cheap, or the swap saves nothing.
const cheapCount = route.filter(function (r) { return r.use === "cheap"; }).length;
if (cheapCount < 1) bad("nothing routes to the cheap model, so there is no saving");
console.log("config OK: " + c.modality + " swap " + c.cheap + " -> " + c.premium + "; " + dom.length + " dominated case(s) all routed to premium, " + cheapCount + " case(s) to cheap");
'
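
To see the guard fail, the same checks can be written as a function and run against a config that forgets to route a dominated case to premium. This is a minimal sketch, not the CI implementation: `validateSwapGuard` is a hypothetical name, and it collects problems into a list instead of exiting on the first one.

```javascript
// Sketch: the swap-guard checks as a pure function (hypothetical helper,
// not part of the workflow). Returns a list of problems instead of exiting.
function validateSwapGuard(c) {
  const errors = [];
  if (!c.cheap || !c.premium) errors.push("need both a cheap and a premium model");
  const route = c.route || [];
  const ruleFor = new Map();
  for (const r of route) {
    if (r.when && r.use) ruleFor.set(r.when, r.use);
  }
  const dom = c.dominated_cases || [];
  if (dom.length < 1) errors.push("declare the cases the premium model dominates");
  for (const d of dom) {
    if (ruleFor.get(d) !== "premium") {
      errors.push("dominated case " + d + " is not routed to the premium model");
    }
  }
  if (!route.some((r) => r.use === "cheap")) {
    errors.push("nothing routes to the cheap model, so there is no saving");
  }
  return errors;
}

// A config that quietly sends charts to the cheap model.
const broken = {
  modality: "image",
  cheap: "wan-2.5",
  premium: "gpt-image-2",
  dominated_cases: ["text_in_frame", "charts_or_layout"],
  route: [
    { when: "text_in_frame", use: "premium" },
    { when: "charts_or_layout", use: "cheap" },
    { when: "general_scene", use: "cheap" },
  ],
};

console.log(validateSwapGuard(broken));
```

Run against `broken`, the function reports a single problem, the `charts_or_layout` case, which is exactly the silent degradation the guard exists to catch.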

2
Generate, routing each request by its content (the model step, not checked by CI)
At generation time, tag each prompt by whether it has text or precise layout, and send it to the model the table chose: cheap for general scenes, premium when words are in the frame. The same pattern applies to video (cheap for iterations, premium for the physics-heavy hero shot). The generation and quality are fenced.
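
The routing step could look something like the sketch below. It assumes each request already carries a hypothetical `tags` array (for example `["text_in_frame"]`) produced by your own tagging step; how prompts get tagged is not specified by this workflow. `pickModel` is an illustrative name, and the function only chooses a model name; it does not call any image API.

```javascript
// Sketch: choose a model name from the swap-guard table.
// Assumption: requests carry a `tags` array; the tagging itself is up to you.
const decision = {
  cheap: "wan-2.5",
  premium: "gpt-image-2",
  route: [
    { when: "text_in_frame", use: "premium" },
    { when: "charts_or_layout", use: "premium" },
    { when: "general_scene", use: "cheap" },
  ],
};

function pickModel(decision, tags) {
  const premiumCases = new Set(
    decision.route.filter((r) => r.use === "premium").map((r) => r.when)
  );
  // Any premium-routed tag wins, so a scene with a caption still goes premium.
  const needsPremium = tags.some((t) => premiumCases.has(t));
  return needsPremium ? decision.premium : decision.cheap;
}

console.log(pickModel(decision, ["general_scene"]));                  // wan-2.5
console.log(pickModel(decision, ["general_scene", "text_in_frame"])); // gpt-image-2
```

One design choice to review: in this sketch, untagged or unknown requests fall through to the cheap model. If your tagging step is unreliable, defaulting unknowns to premium may be the safer trade.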

Eval, 2 fixtures

Last passed: verified today

guard-ok · contains · timeout 30s · max $0
Expected: config OK: image swap wan-2.5 -> gpt-image-2; 2 dominated case(s) all routed to premium, 1 case(s) to cheap
clean-exit · exit_code · timeout 30s · max $0
Expected: 0

Results

For general scenes, a cheap model like Wan is roughly seven to eight times cheaper per image than the top-tier GPT-Image-2, and a typical viewer barely notices the gap. But GPT-Image-2 leads the Artificial Analysis image leaderboard by the largest margin that leaderboard has recorded, specifically on text inside images, dense layouts, infographics, and multilingual typography. So the honest swap is conditional: cheap for pictures, premium when there are words in the frame.
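
How much the conditional swap saves depends on what share of requests still go to premium. As a rough sketch, the blended cost relative to sending everything to premium is `premiumShare + (1 - premiumShare) / priceRatio`. The ratio of 7.5 below is just the midpoint of the "seven to eight times" figure, and the premium shares are illustrative assumptions, not measured traffic.

```javascript
// Sketch: blended cost of a routed mix, relative to an all-premium baseline.
// Assumptions: priceRatio is premium price / cheap price per image;
// premiumShare is the fraction of requests routed to premium.
function relativeCost(premiumShare, priceRatio) {
  return premiumShare + (1 - premiumShare) / priceRatio;
}

for (const share of [0.1, 0.2, 0.5]) {
  const cost = relativeCost(share, 7.5);
  console.log(
    "premium share " + share + ": " + (100 * (1 - cost)).toFixed(0) + "% saved"
  );
}
```

Under these assumptions, routing 20% of requests to premium still saves about 69% against an all-premium baseline, and even a 50% premium share saves about 43%.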
