Agents · Hybrid · Free · Active · Machine-verified · intermediate · ~10 min setup

OrcaRouter: only fan out when it is worth it

Gate the expensive fan-out behind a difficulty condition so easy chat stays cheap and only hard requests pay for a panel.

by Shilpa Mitra · verified today · v1.0.0

CI-verified, 2/2 fixtures passing.

Intended Use

For anyone pointing real traffic at a multi-model router. CI runs OrcaRouter's DSL lint, which checks that routing.yaml parses and declares version 1, that the cheap_chat rule delegates to the cheapest strategy, that the fan-out rule is gated behind a difficulty condition rather than matching everything, and that a default is present. No keys, no calls. The routing decisions themselves are fenced off: CI never exercises them.

Not for

  • Trusting difficulty as ground truth. It is a classifier's guess; watch routing in shadow mode for a week and tune the thresholds.
  • Skipping the gate. Ungated fan-out is the failure mode that produces a quiet bill.

The Stack

Tested Against

  • docs.orcarouter.ai/routing/routing-dsl (2026-06)
  • ruby@3.x (YAML stdlib)

Side effects & data flow

  • Network: none, local only
  • Writes: ./routing.yaml
  • Credentials: none required

Prerequisites

  • An OrcaRouter account (hosted DSL, BYOK)
  • Provider API keys, needed only to route live traffic (the CI lint needs none)

Steps

  1. Author the gated routing rules and lint them

    Write routing.yaml with three rules: cheap chat goes to the cheapest model, a fan-out is gated behind difficulty > 0.6, and a repair rule escalates when the last test failed and at least two consecutive errors have piled up. CI parses the DSL and asserts that cheap_chat delegates to cheapest, that the fan-out rule carries a difficulty gate (not a match-all), and that a default is present.

    cat > routing.yaml <<'YAML'
    version: 1
    rules:
      - id: cheap_chat
        when: task_class == "chat" && difficulty < 0.3
        use: { delegate: cheapest }
      - id: hard_only_fanout
        when: difficulty > 0.6
        use:
          parallel:
            - { model: "anthropic/claude-opus-4.8" }
            - { model: "openai/gpt-4o", samples: 2 }
          arbiter:
            strategy: best_of_n
            model: "anthropic/claude-opus-4.8"
      - id: repair_after_failed_test
        when: agent_state.last_test_failed && agent_state.consecutive_errors >= 2
        use:
          model: "anthropic/claude-opus-4.8"
          reason_tag: repair
    default:
      delegate: balanced
    YAML
    # Lint the config locally: no network calls, no keys.
    ruby -ryaml -e '
    # An empty file parses to nil, so fall back to an empty hash.
    c = YAML.safe_load(File.read("routing.yaml")) || {}
    abort "BAD: version must be 1" unless c["version"] == 1
    abort "BAD: no default" unless c["default"]
    rules = c["rules"] || []
    # cheap_chat must hand off to the cheapest strategy.
    chat = rules.find { |r| r["id"] == "cheap_chat" }
    abort "BAD: cheap_chat does not delegate cheapest" unless chat && (chat["use"] || {})["delegate"] == "cheapest"
    # The fan-out rule must exist and mention difficulty in its when clause.
    fan = rules.find { |r| r["id"] == "hard_only_fanout" }
    abort "BAD: no hard_only_fanout rule" unless fan
    gate = fan["when"].to_s
    abort "BAD: fan-out is not gated behind difficulty (would match everything)" unless gate.include?("difficulty")
    puts "config OK: cheap_chat -> cheapest, fan-out gated behind a difficulty condition, default present"
    '
  2. Watch it in shadow mode, then go live (not checked by CI)

    Run OrcaRouter's shadow mode for a week to see what the rules would have done before they touch live traffic, then tune the difficulty thresholds. The routing decisions are fenced off from CI, so shadow mode is where they actually get checked.
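
Before a week of shadow traffic, it can help to dry-run the three rules by hand against a few made-up requests. The sketch below is a hypothetical stand-in written in plain Ruby, not OrcaRouter's engine: the `RULES` table and the `route` helper are illustrative names, the predicates are hand-translated from the `when` clauses in routing.yaml, and it assumes rules are tried top to bottom with the first match winning, which the source does not confirm.

```ruby
# Hypothetical dry-run of the routing.yaml rules. NOT OrcaRouter's engine.
# Assumption: rules are evaluated in file order and the first match wins.
RULES = [
  ["cheap_chat",               ->(r) { r[:task_class] == "chat" && r[:difficulty] < 0.3 }],
  ["hard_only_fanout",         ->(r) { r[:difficulty] > 0.6 }],
  ["repair_after_failed_test", ->(r) { r[:last_test_failed] && r[:consecutive_errors] >= 2 }],
].freeze

# Returns the id of the first matching rule, or "default" when none match.
def route(request)
  defaults = { task_class: nil, difficulty: 0.0, last_test_failed: false, consecutive_errors: 0 }
  req = defaults.merge(request)
  match = RULES.find { |_id, predicate| predicate.call(req) }
  match ? match.first : "default"
end

# Made-up sample requests covering each rule and the gaps between them.
samples = [
  { task_class: "chat", difficulty: 0.1 },
  { task_class: "chat", difficulty: 0.45 },
  { task_class: "code", difficulty: 0.1 },
  { task_class: "code", difficulty: 0.8 },
  { task_class: "code", difficulty: 0.4, last_test_failed: true, consecutive_errors: 2 },
  { task_class: "code", difficulty: 0.9, last_test_failed: true, consecutive_errors: 3 },
]

samples.each { |s| puts "#{s} -> #{route(s)}" }
```

Two things stand out under the first-match assumption. Requests with difficulty between 0.3 and 0.6, and easy non-chat requests, match no rule and fall through to the balanced default. A hard request that is also in repair state reaches hard_only_fanout before the repair rule, so rule order is worth confirming in shadow mode.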

Eval, 2 fixtures

Last passed: verified today
  • gated-ok · contains · timeout 30s · max $0

    Expected: config OK: cheap_chat -> cheapest, fan-out gated behind a difficulty condition, default present

  • clean-exit · exit_code · timeout 30s · max $0

    Expected: 0
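
Both fixtures exercise the passing path. The lint's gate check only looks for the word "difficulty" in the `when` clause, so a clause such as `difficulty >= 0` would pass while still matching nearly everything. A stricter check might look like the sketch below. The `GATE` regex and the `gated?` helper are hypothetical and are not part of OrcaRouter's DSL lint; they assume the gate is written as `difficulty > N` or `difficulty >= N` with a literal number.

```ruby
# Hypothetical stricter gate check, for illustration only.
# Matches a lower bound on difficulty and captures the threshold.
GATE = /\bdifficulty\s*(>=|>)\s*(\d+(?:\.\d+)?)/

# True when the clause sets a positive lower bound on difficulty.
def gated?(when_clause)
  m = GATE.match(when_clause.to_s)
  return false unless m
  # A threshold of 0 admits nearly every request, so treat it as ungated.
  m[2].to_f > 0.0
end

[
  "difficulty > 0.6",   # the rule from routing.yaml
  "true",               # match-all
  "difficulty >= 0",    # mentions difficulty but gates nothing
  "difficulty < 0.3",   # an upper bound, not a fan-out gate
].each do |clause|
  puts "#{clause.inspect}: #{gated?(clause) ? 'gated' : 'NOT gated'}"
end
```

This is still a string check, not a parser: a clause that ORs the gate with a broader condition would slip through, so shadow mode remains the real test.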

Results

Every parallel leg bills separately, so fanning out every request is how a clever setup becomes a surprise invoice. Send easy chat to the cheapest model, fan out only the hard ones, and escalate after a failed test.
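
To make the billing point concrete, the sketch below counts billed calls per rule from the same `use` blocks as routing.yaml. The `billed_calls` helper is hypothetical, and it rests on two assumptions the source does not spell out: each sample on a parallel leg is billed as its own call, and the arbiter adds one more call. The 1,000-request traffic mix is invented for the arithmetic, not measured.

```ruby
require "yaml"

# The use blocks copied from routing.yaml; when clauses are omitted here.
ROUTING = YAML.safe_load(<<~CONFIG)
  rules:
    - id: cheap_chat
      use: { delegate: cheapest }
    - id: hard_only_fanout
      use:
        parallel:
          - { model: "anthropic/claude-opus-4.8" }
          - { model: "openai/gpt-4o", samples: 2 }
        arbiter:
          strategy: best_of_n
          model: "anthropic/claude-opus-4.8"
CONFIG

# Hypothetical call counter.
# Assumptions: every sample bills separately, and the arbiter is one extra call.
def billed_calls(use)
  legs = use.fetch("parallel", [])
  return 1 if legs.empty? # a single model or delegate makes one call
  sampled = legs.sum { |leg| leg.fetch("samples", 1) }
  arbiter = use.key?("arbiter") ? 1 : 0
  sampled + arbiter
end

ROUTING.fetch("rules").each do |rule|
  puts "#{rule.fetch('id')}: #{billed_calls(rule.fetch('use'))} billed call(s)"
end

# Invented traffic mix: 900 easy requests, 100 hard ones.
easy = 900
hard = 100
fanout = billed_calls(ROUTING["rules"][1]["use"])
puts "gated:   #{easy * 1 + hard * fanout} calls"
puts "ungated: #{(easy + hard) * fanout} calls"
```

Under these assumptions the fan-out rule costs four calls per request, so the invented mix comes to 1,300 calls gated versus 4,000 ungated. Token prices differ per model, so the call count is only a rough proxy for the invoice.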
