OKF: generate a bundle from your schema, then ground it with citations
Mirror Google's reference pattern: a model drafts one OKF concept per table or module, and a stricter project rule requires a # Citations section on anything with a resource, so the knowledge is checkable, not just plausible.
Run this workflow
CI-verified, 3/3 fixtures passing.
Intended Use
Generating an OKF bundle from a database or codebase. CI runs the v0.1 conformance check and a stricter project rule: every concept that carries a `resource` must also have a `# Citations` section. The CI check needs no API key and makes no model call. The drafting and enrichment passes are fenced (they run outside CI); do not ship generated concepts unreviewed.
Not for
- Shipping model-generated concepts without review. A wrong knowledge base is worse than none because agents trust it.
- Treating citations as optional. The spec makes them optional, but this workflow requires them for grounding.
The Stack
Tested Against
- okf@0.1
- knowledge-catalog okf/SPEC.md (2026-06)
- ruby@3.x (YAML stdlib)
Side effects & data flow
- Network: none, local only
- Writes: ./<bundle>/ (markdown files)
- Credentials: none required
Prerequisites
- A model to draft concepts (the fenced step)
- Authoritative docs to cite
Steps
- 1
Draft concepts with citations, then check conformance + grounding
For each table or module, have a model draft an OKF concept with a # Schema section and cross-links, then a second pass that attaches a # Citations section pointing at authoritative sources. CI checks v0.1 conformance and, as a stricter project rule, that every concept with a `resource` carries a # Citations section.
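The concept shape described above can be sketched as a small renderer. This is a minimal sketch and not part of the verified workflow: `render_concept` and its hash keys are hypothetical names, the frontmatter keys simply mirror the fixture in this step, and the schema and citation text are assumed to come from the model passes and from your own authoritative docs.

```ruby
require "yaml"

# Hypothetical helper (not from the OKF spec): render one concept file
# from a plain hash. Key names are assumptions for illustration.
def render_concept(fields)
  frontmatter = {
    "type" => fields.fetch(:type),
    "title" => fields.fetch(:title),
    "description" => fields.fetch(:description),
  }
  frontmatter["resource"] = fields[:resource] if fields[:resource]

  # YAML.dump emits the opening "---" line; the closing one is added by hand.
  out = YAML.dump(frontmatter) + "---\n"
  out << "# Schema\n" << fields.fetch(:schema).strip << "\n"

  # The second (enrichment) pass supplies the citations; number them in order.
  citations = fields.fetch(:citations, [])
  unless citations.empty?
    out << "# Citations\n"
    citations.each_with_index { |text, i| out << "[#{i + 1}] #{text}\n" }
  end
  out
end

puts render_concept(
  type: "BigQuery Table",
  title: "Orders",
  description: "One row per completed customer order.",
  resource: "https://console.cloud.google.com/bigquery?d=sales&t=orders",
  schema: "One row per completed order.",
  citations: ["Dataset README: https://example.com/sales/README.md"]
)
```

Rendering from a hash keeps the frontmatter valid YAML regardless of what the model writes in the free-text fields. A concept rendered with a `resource` but no citations would fail the project rule, which is the intended behavior.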
# Create a two-concept fixture bundle.
mkdir -p bundle/apis bundle/tables
cat > bundle/apis/create-order.md <<'MD'
---
type: API Endpoint
title: Create Order
description: Creates an order from a validated cart.
resource: https://api.acme.dev/v1/orders
tags: [orders, api]
---
# Schema
Request: cart_id (string, required). Returns the created order.
See the [orders table](/tables/orders.md) it writes to.
# Citations
[1] OpenAPI spec for /v1/orders: https://api.acme.dev/openapi.json
MD
cat > bundle/tables/orders.md <<'MD'
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?d=sales&t=orders
---
# Schema
One row per completed order.
# Citations
[1] Dataset README: https://example.com/sales/README.md
MD
# Check v0.1 conformance plus the project rule (resource implies # Citations).
ruby -ryaml -e '
require "date"
# Reserved file names are skipped, not treated as concepts.
RESERVED = ["index.md", "log.md"]
concepts = 0; grounded = 0; errors = []
Dir.glob("bundle/**/*.md").sort.each do |path|
  next if RESERVED.include?(File.basename(path))
  lines = File.read(path).lines
  # Frontmatter must open on the first line and be closed by a second "---".
  unless lines.first && lines.first.strip == "---"
    errors << (path + ": no frontmatter"); next
  end
  rest = lines[1..]
  idx = rest.index { |l| l.strip == "---" }
  unless idx; errors << (path + ": unterminated frontmatter"); next; end
  # Unparseable YAML becomes nil and is reported as a missing type.
  fm = (begin; YAML.safe_load(rest[0...idx].join, permitted_classes: [Date, Time]); rescue; nil; end)
  unless fm.is_a?(Hash) && !fm["type"].to_s.empty?
    errors << (path + ": missing or empty type"); next
  end
  concepts += 1
  body = rest[(idx + 1)..] || []
  # Project rule: a concept with a resource must carry a # Citations heading.
  unless fm["resource"].to_s.empty?
    if body.any? { |l| l.strip == "# Citations" }
      grounded += 1
    else
      errors << (path + ": has a resource but no # Citations section")
    end
  end
end
if errors.empty?
  puts "OKF bundle conformant + grounded: " + concepts.to_s + " concepts, " + grounded.to_s + " with a resource carry a # Citations section"
else
  puts "NONCONFORMANT: " + errors.join("; "); exit 1
end
'
- 2
Review the first pass (the model step, not checked by CI)
Have a human review the generated concepts wherever they are load-bearing; the model will invent column meanings and join keys that are subtly wrong. The drafting and enrichment passes are fenced: CI only proves conformance and that citations are present.
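One thing a reviewer can check mechanically is whether the drafted cross-links point at concepts that exist. The sketch below is an illustration, not something CI runs. `broken_links` is a hypothetical helper, and it assumes that links starting with `/` resolve against the bundle root, as the fixture link `/tables/orders.md` suggests; the spec may define link resolution differently.

```ruby
# Sketch: list bundle-absolute markdown links that do not resolve to a file.
# Assumption: a link target starting with "/" is relative to the bundle root.
def broken_links(bundle_dir)
  link_pattern = /\[[^\]]*\]\(([^)\s]+)\)/   # captures the target of [text](target)
  broken = []
  Dir.glob(File.join(bundle_dir, "**", "*.md")).sort.each do |path|
    File.read(path).scan(link_pattern).flatten.each do |target|
      next unless target.start_with?("/")    # external and relative links are skipped
      file = target.sub(/#.*\z/, "")         # ignore any #fragment
      broken << [path, target] unless File.file?(File.join(bundle_dir, file))
    end
  end
  broken
end

# Prints nothing when every bundle-absolute link resolves.
broken_links("bundle").each do |path, target|
  puts "#{path}: unresolved link #{target}"
end
```

A resolving link only shows that the target file exists. Whether the relationship it claims (for example, which table an endpoint writes to) is correct still needs a human.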
Eval, 3 fixtures
Last passed: verified today
- conformant-grounded (contains, timeout 30s · max $0). Expected: OKF bundle conformant + grounded:
- citations (contains, timeout 30s · max $0). Expected: with a resource carry a # Citations section
- clean-exit (exit_code, timeout 30s · max $0). Expected: 0
Results
The producer-side workflow, with the discipline that matters: a model documenting a schema will confidently invent semantics, so each load-bearing claim must point back to an authoritative source you can check.
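The CI rule only looks for the `# Citations` heading, so an empty section would still pass. A slightly stricter check, sketched here under stated assumptions, lists the entries in that section and asks whether at least one contains a URL. `citation_entries` and `checkable?` are hypothetical names. The sketch treats any line starting with `# ` as a heading, so it would misread such lines inside code samples.

```ruby
# Sketch: collect the non-empty lines under "# Citations".
# Any other top-level "# " heading ends the section.
def citation_entries(markdown)
  entries = []
  in_section = false
  markdown.each_line do |line|
    text = line.strip
    if text.start_with?("# ")
      in_section = (text == "# Citations")
    elsif in_section && !text.empty?
      entries << text
    end
  end
  entries
end

# True when at least one citation entry carries an http(s) URL.
# This shows that a source is named, not that the source is authoritative.
def checkable?(markdown)
  citation_entries(markdown).any? { |entry| entry.match?(%r{https?://\S+}) }
end

concept = "# Schema\nOne row per completed order.\n" \
          "# Citations\n[1] Dataset README: https://example.com/sales/README.md\n"
puts citation_entries(concept)
puts checkable?(concept)
```

A reviewer can read the listed entries alongside the concept and confirm that each load-bearing claim is actually supported by the source it names.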