Agents · Open Source · Free · Active · Machine-verified · intermediate · ~15 min setup

OKF: generate a bundle from your schema, then ground it with citations

Mirror Google's reference pattern: a model drafts one OKF concept per table or module, and a stricter project rule requires a # Citations section on anything with a resource, so the knowledge is checkable, not just plausible.

by Shilpa Mitra · verified today · v1.0.0

Run this workflow

CI-verified, 3/3 fixtures passing.


Intended Use

Generating an OKF bundle from a database or codebase. CI runs the v0.1 conformance check AND a stricter project rule: every concept that carries a `resource` must also have a `# Citations` section. No key, no model call. The drafting and enrichment passes are fenced; do not ship generated concepts unreviewed.
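
The two checks named above can be read as separate predicates. The following is a minimal sketch, assuming the baseline check is "frontmatter with a non-empty `type`", as in this workflow's CI script; it is not a full implementation of the OKF spec. The helper names `split_concept`, `conformant?`, and `grounded?` are hypothetical. It shows a concept that passes the baseline check but fails the stricter project rule until a `# Citations` section is added.

```ruby
require "yaml"
require "date"

# Split a concept file into [frontmatter_hash, body_lines]. Returns nil when
# the frontmatter is missing, unterminated, or not a valid YAML mapping.
def split_concept(text)
  lines = text.lines
  return nil unless lines.first&.strip == "---"
  rest = lines[1..]
  close = rest.index { |l| l.strip == "---" }
  return nil unless close
  fm = YAML.safe_load(rest[0...close].join, permitted_classes: [Date, Time])
  fm.is_a?(Hash) ? [fm, rest[(close + 1)..] || []] : nil
rescue Psych::Exception
  nil
end

# Baseline check (assumption: mirrors this workflow's script, not the full spec).
def conformant?(text)
  fm, = split_concept(text)
  !fm.nil? && !fm["type"].to_s.empty?
end

# Stricter project rule: a concept with a resource needs a "# Citations" heading.
def grounded?(text)
  fm, body = split_concept(text)
  return false if fm.nil?
  return true if fm["resource"].to_s.empty?
  body.any? { |l| l.strip == "# Citations" }
end

UNGROUNDED = <<~MD
  ---
  type: BigQuery Table
  title: Orders
  resource: https://console.cloud.google.com/bigquery?d=sales&t=orders
  ---

  # Schema
  One row per completed order.
MD

GROUNDED = UNGROUNDED + "\n# Citations\n[1] Dataset README: https://example.com/sales/README.md\n"

puts "ungrounded: conformant=#{conformant?(UNGROUNDED)} grounded=#{grounded?(UNGROUNDED)}"
puts "grounded:   conformant=#{conformant?(GROUNDED)} grounded=#{grounded?(GROUNDED)}"
```

Keeping the two predicates apart makes it clear which failures come from the spec-level check and which come from the project's own grounding rule.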

Not for

  • Shipping model-generated concepts without review. A wrong knowledge base is worse than none, because agents trust it.
  • Treating citations as optional. The spec makes them optional, but this workflow requires them for grounding.

The Stack

Tested Against

  • okf@0.1
  • knowledge-catalog okf/SPEC.md (2026-06)
  • ruby@3.x (YAML stdlib)

Side effects & data flow

  • Network: none, local only
  • Writes: ./<bundle>/ (markdown files)
  • Credentials: none required

Prerequisites

  • A model to draft concepts (the fenced step)
  • Authoritative docs to cite

Steps

  1. Draft concepts with citations, then check conformance + grounding

    For each table or module, have a model draft an OKF concept with a # Schema section and cross-links, then run a second pass that attaches a # Citations section pointing at authoritative sources. CI checks v0.1 conformance and, as a stricter project rule, that every concept with a `resource` carries a # Citations section.

    mkdir -p bundle/apis bundle/tables
    cat > bundle/apis/create-order.md <<'MD'
    ---
    type: API Endpoint
    title: Create Order
    description: Creates an order from a validated cart.
    resource: https://api.acme.dev/v1/orders
    tags: [orders, api]
    ---
    
    # Schema
    Request: cart_id (string, required). Returns the created order.
    See the [orders table](/tables/orders.md) it writes to.
    
    # Citations
    [1] OpenAPI spec for /v1/orders: https://api.acme.dev/openapi.json
    MD
    cat > bundle/tables/orders.md <<'MD'
    ---
    type: BigQuery Table
    title: Orders
    description: One row per completed customer order.
    resource: https://console.cloud.google.com/bigquery?d=sales&t=orders
    ---
    
    # Schema
    One row per completed order.
    
    # Citations
    [1] Dataset README: https://example.com/sales/README.md
    MD
    ruby -ryaml -e '
    require "date"
    # Reserved file names are skipped, not checked as concepts.
    RESERVED = ["index.md", "log.md"]
    concepts = 0; grounded = 0; errors = []
    Dir.glob("bundle/**/*.md").sort.each do |path|
      next if RESERVED.include?(File.basename(path))
      lines = File.read(path).lines
      # The file must open with a frontmatter delimiter.
      unless lines.first && lines.first.strip == "---"
        errors << (path + ": no frontmatter"); next
      end
      rest = lines[1..]
      # Find the closing delimiter; everything before it is YAML.
      idx = rest.index { |l| l.strip == "---" }
      unless idx; errors << (path + ": unterminated frontmatter"); next; end
      # Unparseable YAML yields nil and is reported by the type check below.
      fm = (begin; YAML.safe_load(rest[0...idx].join, permitted_classes: [Date, Time]); rescue; nil; end)
      unless fm.is_a?(Hash) && !fm["type"].to_s.empty?
        errors << (path + ": missing or empty type"); next
      end
      concepts += 1
      body = rest[(idx + 1)..] || []
      # Stricter project rule: a resource requires a "# Citations" heading.
      unless fm["resource"].to_s.empty?
        if body.any? { |l| l.strip == "# Citations" }
          grounded += 1
        else
          errors << (path + ": has a resource but no # Citations section")
        end
      end
    end
    if errors.empty?
      puts "OKF bundle conformant + grounded: " + concepts.to_s + " concepts, " + grounded.to_s + " with a resource carry a # Citations section"
    else
      puts "NONCONFORMANT: " + errors.join("; "); exit 1
    end
    '
  2. Review the first pass (the model step, not checked by CI)

    Have a human review the generated concepts for anything load-bearing; the model will invent column meanings and join keys that are subtly wrong. The drafting and enrichment passes are fenced: CI only proves conformance and that citations are present.
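
One way to shrink what the reviewer has to check is to fill in the mechanical parts of each concept directly from the schema and leave only the prose to the model. The sketch below assumes a hypothetical `SCHEMA` array and a hypothetical `render_skeleton` helper; the entry shape, column names, and TODO wording are illustrative and not part of OKF. The skeleton deliberately omits a `# Citations` section, so the grounding check in step 1 keeps failing until someone attaches real sources.

```ruby
require "yaml"
require "fileutils"
require "tmpdir"

# Hypothetical schema description. In practice this would be read from a
# database catalog or codebase; names and URLs here are illustrative.
SCHEMA = [
  { path: "tables/orders.md", type: "BigQuery Table", title: "Orders",
    resource: "https://console.cloud.google.com/bigquery?d=sales&t=orders",
    columns: %w[order_id cart_id created_at] }
]

# Render a concept skeleton. Only facts taken from the schema are filled in;
# meanings are left as TODO markers for the model pass and the human review.
def render_skeleton(entry)
  frontmatter = {
    "type" => entry[:type],
    "title" => entry[:title],
    "description" => "TODO: draft and review",
    "resource" => entry[:resource]
  }
  columns = entry[:columns].map { |c| "- #{c}: TODO meaning (cite source)" }
  # Hash#to_yaml already emits the opening "---" line and a trailing newline.
  [frontmatter.to_yaml + "---", "", "# Schema", *columns, ""].join("\n")
end

# Write one skeleton per schema entry under root and return the paths written.
def write_bundle(schema, root)
  schema.map do |entry|
    target = File.join(root, entry[:path])
    FileUtils.mkdir_p(File.dirname(target))
    File.write(target, render_skeleton(entry))
    target
  end
end

# Demo: write into a temporary directory and print the first skeleton.
Dir.mktmpdir do |root|
  written = write_bundle(SCHEMA, root)
  puts File.read(written.first)
end
```

Because column names come from the schema rather than the model, the reviewer can concentrate on the claimed meanings and join keys, which is where invented semantics tend to appear.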

Eval, 3 fixtures

Last passed: verified today
  • conformant-grounded · contains · timeout 30s · max $0

    Expected: OKF bundle conformant + grounded:

  • citations · contains · timeout 30s · max $0

    Expected: with a resource carry a # Citations section

  • clean-exit · exit_code · timeout 30s · max $0

    Expected: 0
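
The three fixtures above reduce to two kinds of assertion: a substring match on the output and a comparison of the exit code. A minimal sketch of how such checks could be evaluated follows. The `Fixture` struct, `run_command`, and `passed?` are hypothetical names, and the command being run is a stand-in that prints the success line; this is not the harness that actually verifies this workflow.

```ruby
require "open3"
require "rbconfig"
require "timeout"

# Hypothetical fixture shape, modeled on the three fixtures listed above.
Fixture = Struct.new(:name, :kind, :expected, :timeout_s)

FIXTURES = [
  Fixture.new("conformant-grounded", :contains, "OKF bundle conformant + grounded:", 30),
  Fixture.new("citations", :contains, "with a resource carry a # Citations section", 30),
  Fixture.new("clean-exit", :exit_code, 0, 30)
]

# Run a command, capture combined stdout and stderr, return [output, exit_code].
# Note: on timeout this raises Timeout::Error without killing the child process.
def run_command(argv, timeout_s)
  Timeout.timeout(timeout_s) do
    output, status = Open3.capture2e(*argv)
    [output, status.exitstatus]
  end
end

# Evaluate one fixture against a captured output and exit code.
def passed?(fixture, output, exit_code)
  case fixture.kind
  when :contains then output.include?(fixture.expected)
  when :exit_code then exit_code == fixture.expected
  else false
  end
end

# Stand-in for the step 1 script: prints its success line and exits 0.
STAND_IN = [RbConfig.ruby, "-e",
            'puts "OKF bundle conformant + grounded: 2 concepts, 2 with a resource carry a # Citations section"']

output, exit_code = run_command(STAND_IN, 30)
FIXTURES.each do |f|
  puts "#{f.name}: #{passed?(f, output, exit_code) ? "pass" : "fail"}"
end
```

Note what these fixtures do not cover: none of them inspects whether a citation is correct, which is why the review step stays outside CI.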

Results

The producer-side workflow, with the discipline that matters: a model documenting a schema will confidently invent semantics, so each load-bearing claim must point back to an authoritative source you can check.
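
To make "a source you can check" concrete, a reviewer could list the URLs cited by each concept and open them one by one. The sketch below assumes citations are written as plain lines under a `# Citations` heading, as in the step 1 examples; `citation_urls` is a hypothetical helper, and the URL pattern is a simple approximation, not a full URL parser.

```ruby
# Collect URLs that appear under the "# Citations" heading, stopping at the
# next top-level heading. Frontmatter URLs such as resource are ignored.
def citation_urls(text)
  in_citations = false
  urls = []
  text.each_line do |line|
    stripped = line.strip
    if stripped.start_with?("# ")
      in_citations = (stripped == "# Citations")
    elsif in_citations
      urls.concat(stripped.scan(%r{https?://\S+}))
    end
  end
  urls
end

CONCEPT = <<~MD
  ---
  type: API Endpoint
  title: Create Order
  resource: https://api.acme.dev/v1/orders
  ---

  # Schema
  Request: cart_id (string, required). Returns the created order.

  # Citations
  [1] OpenAPI spec for /v1/orders: https://api.acme.dev/openapi.json
MD

citation_urls(CONCEPT).each { |url| puts "check: #{url}" }
# prints "check: https://api.acme.dev/openapi.json"
```

The list only tells the reviewer where to look; whether each source actually supports the claim still has to be judged by a person.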
