codebase-memory-mcp: wire the knowledge graph, stop re-reading files
Point your MCP client at a codebase-memory-mcp server so your coding agent queries the repo's knowledge graph instead of re-reading files into context on every question, cutting token spend without changing answer quality.
Run this workflow
CI-verified, 2/2 fixtures passing.
Intended Use
For anyone whose coding agent is burning tokens re-reading the same files. CI validates the MCP server config: the file parses as JSON, it contains at least one named server block, the block's command is a string (the path to the codebase-memory-mcp binary), and args is an array. The binary download and the actual indexing run on your machine and are fenced.
Not for
- Expecting the tool itself to improve answer quality. It is a structural backend that reduces what the agent reads; the model and your prompts still drive the quality.
- Very small repos where file-by-file context fits cheaply. Below a few thousand lines, the overhead of indexing is not worth it.
Tested Against
- DeusData/codebase-memory-mcp (2026-06)
- node@20
Side effects & data flow
- Network: none, local only
- Writes: ./mcp-config.json
- Credentials: none required
Prerequisites
- codebase-memory-mcp binary (curl install: see github.com/DeusData/codebase-memory-mcp)
- An MCP-compatible client (Claude Code, Cursor, etc.)
Steps
- 1
Write the MCP server config and validate it
Add codebase-memory-mcp to your MCP client config (Claude Code reads project-scoped servers from .mcp.json; Cursor uses .cursor/mcp.json). The server block needs a name, the path to the binary as the command, and an args array, which stays empty because no extra flags are needed. CI checks that the config parses and has the required fields; the binary install and indexing are fenced.
```shell
cat > mcp-config.json <<'JSON'
{
  "mcpServers": {
    "codebase-memory": {
      "command": "/usr/local/bin/codebase-memory-mcp",
      "args": []
    }
  }
}
JSON
node -e '
const fs = require("fs");
const c = JSON.parse(fs.readFileSync("mcp-config.json", "utf8"));
function bad(m) { console.error("BAD: " + m); process.exit(1); }
const servers = c.mcpServers || c.mcp_servers || {};
const keys = Object.keys(servers);
if (keys.length === 0) bad("mcpServers must have at least one entry");
const srv = servers[keys[0]];
if (!srv.command || typeof srv.command !== "string") bad("server entry must have a command (path to binary)");
if (!Array.isArray(srv.args)) bad("args must be an array");
console.log("config OK: MCP server " + JSON.stringify(keys[0]) + " points at " + srv.command);
'
```
- 2
Install the binary and index your repo (fenced)
Download the binary for your platform (one-line curl from the releases page), then run index_repository with the absolute path to your repo. The first index on a large codebase takes a few minutes; subsequent queries return in under a millisecond. Compare your token counts before and after on a representative task. Neither the install nor the indexing runs in CI.
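As a rough illustration of the indexing call, the sketch below builds the JSON-RPC 2.0 `tools/call` message that MCP clients use to invoke a server tool. It assumes index_repository is exposed as an MCP tool; the argument key `repo_path` and the helper name `buildIndexRequest` are hypothetical, so check the server's tool listing for the real schema. In practice your MCP client sends this for you when the agent calls the tool.

```javascript
const path = require("path");

// Build a JSON-RPC 2.0 "tools/call" request for the index_repository tool.
// ASSUMPTION: the argument key "repo_path" is illustrative, not confirmed.
function buildIndexRequest(repoDir, id = 1) {
  // Resolve to an absolute path, since the step asks for one.
  const absolute = path.resolve(repoDir);
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "index_repository",
      arguments: { repo_path: absolute }, // hypothetical key name
    },
  };
}

const request = buildIndexRequest(".");
console.log(JSON.stringify(request, null, 2));
```

Resolving the path up front avoids handing the server a relative path that it would interpret against its own working directory.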
Eval, 2 fixtures
Last passed: verified today
- config-ok (contains), timeout 30s · max $0. Expected: config OK: MCP server
- clean-exit (exit_code), timeout 30s · max $0. Expected: 0
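To reproduce the two fixture checks locally, something like the following sketch could work. It is not the CI harness: the fixture object shape, the `runFixtures` helper, and the stand-in command are assumptions made for illustration. It runs a command once, then checks that stdout contains the expected string and that the exit code matches.

```javascript
const { spawnSync } = require("child_process");

// Run a command once and evaluate two hypothetical fixture types:
// "contains" (stdout includes a string) and "exit_code" (status matches).
function runFixtures(command, args, fixtures, timeoutMs = 30000) {
  const result = spawnSync(command, args, { encoding: "utf8", timeout: timeoutMs });
  return fixtures.map((f) => {
    let pass = false;
    if (f.type === "contains") pass = (result.stdout || "").includes(f.expected);
    else if (f.type === "exit_code") pass = result.status === f.expected;
    return { name: f.name, pass };
  });
}

const fixtures = [
  { name: "config-ok", type: "contains", expected: "config OK: MCP server" },
  { name: "clean-exit", type: "exit_code", expected: 0 },
];

// Stand-in command that prints a success line; swap in the real validator.
const report = runFixtures(
  process.execPath,
  ["-e", 'console.log("config OK: MCP server demo")'],
  fixtures
);
console.log(report);
```

Replace the stand-in command with the step 1 validator to check your own mcp-config.json the same way.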
Results
A preprint (arXiv:2603.27277) evaluated the tool across 31 real-world repos and reported 10x fewer tokens, 2.1x fewer tool calls, and 83% answer quality versus file-by-file exploration. The savings come from feeding the agent less context, not from the tool being smarter. Measure the token drop on your own codebase rather than taking the headline number on faith.
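One way to make that measurement concrete is to record total tokens for the same task with and without the graph backend, then compute the ratio. This is a minimal sketch: the `tokenReduction` helper is hypothetical, and the sample numbers are placeholders, not measurements.

```javascript
// Compare token usage for the same task with and without the graph backend.
function tokenReduction(baselineTokens, graphTokens) {
  if (baselineTokens <= 0 || graphTokens <= 0) {
    throw new RangeError("token counts must be positive");
  }
  return {
    ratio: baselineTokens / graphTokens, // e.g. 8 means "8x fewer tokens"
    savedPercent: (1 - graphTokens / baselineTokens) * 100,
  };
}

// Placeholder counts for illustration only.
const sample = tokenReduction(120000, 15000);
console.log(`${sample.ratio.toFixed(1)}x fewer tokens, ${sample.savedPercent.toFixed(1)}% saved`);
// prints "8.0x fewer tokens, 87.5% saved"
```

Run the comparison on several representative tasks, since a single task can overstate or understate the effect.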