LlamaIndex: index your documents and query them at runtime
Point LlamaIndex at a document corpus and build a VectorStoreIndex so an agent can retrieve the relevant chunks at query time instead of stuffing everything into context.
CI-verified, 2/2 fixtures passing.
Intended Use
For anyone whose knowledge corpus is too large or too changeable to keep in a prompt. CI installs llama-index-core and verifies that the three core retrieval abstractions import cleanly: VectorStoreIndex, SimpleDirectoryReader, and StorageContext. Building an index and querying it require an embedding model and an LLM (OpenAI by default), so those steps are fenced: CI does not run them.
Not for
- Static knowledge that fits in a prompt: RAG adds latency and a retrieval layer for no gain when the knowledge is small
- Expecting CI to verify retrieval quality: that depends on chunking, embedding, and the query, which all belong to the fenced model steps
- Fully offline use without swapping the default embedding/LLM backends
The Stack
Tested Against
llama-index-core@latest
python@3.12
Side effects & data flow
- Network: PyPI, install only
- Writes: ./.venv/
- Credentials: Embedding + LLM key, for the fenced index/query steps only
Prerequisites
- Python 3.10+
- pip
- An embedding model + LLM key for the fenced index/query steps (OpenAI by default)
Steps
- 1
Install llama-index-core and verify the retrieval abstractions import
pip install llama-index-core, then confirm the three abstractions the docs build on: VectorStoreIndex (the index), SimpleDirectoryReader (the loader), and StorageContext (the persistence layer). CI runs exactly this, no key.
python3 -m venv .venv
.venv/bin/pip install -q llama-index-core
.venv/bin/python - <<'EOF'
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
for cls in [VectorStoreIndex, SimpleDirectoryReader, StorageContext]:
    assert callable(cls), f"{cls.__name__} is not callable"
print("llamaindex imports OK: VectorStoreIndex, SimpleDirectoryReader, StorageContext all available")
EOF
- 2
Build the index and query it (the model step, not checked by CI)
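Set your embedding and LLM keys, load your corpus with SimpleDirectoryReader(...).load_data(), call VectorStoreIndex.from_documents(), then index.as_query_engine().query(). To make the index survive restarts, persist it with index.storage_context.persist() and reload it with StorageContext.from_defaults(persist_dir=...) plus load_index_from_storage(). Embedding and retrieval are fenced: CI does not run them.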
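A minimal sketch of step 2, assuming the default OpenAI backends with OPENAI_API_KEY set in the environment. The function name build_and_query, the data_dir argument, and the ./storage directory are illustrative choices, not part of the recipe. With only llama-index-core installed, the default OpenAI embedding and LLM classes likely also need their integration packages (llama-index-embeddings-openai and llama-index-llms-openai), so treat that as something to confirm in your environment.

```python
from pathlib import Path


def build_and_query(data_dir: str, question: str, persist_dir: str = "./storage") -> str:
    """Build (or reload) a VectorStoreIndex over data_dir and answer one question."""
    # Imported lazily so this file can be imported without llama-index installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if Path(persist_dir).exists():
        # Assumption: an existing directory holds a previously persisted index.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        index = load_index_from_storage(storage_context)
    else:
        # Load every readable file under data_dir as documents.
        documents = SimpleDirectoryReader(data_dir).load_data()
        # Chunk and embed the documents; this call needs the embedding backend.
        index = VectorStoreIndex.from_documents(documents)
        # Write the index to disk so the next run can skip re-embedding.
        index.storage_context.persist(persist_dir=persist_dir)

    # Retrieve the best-matching chunks, then have the LLM answer from them.
    response = index.as_query_engine().query(question)
    return str(response)
```

Calling build_and_query("./docs", "What does the refund policy say?") would embed the corpus on the first run and reuse ./storage afterwards. Both arguments here are placeholders.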
Eval, 2 fixtures
Last passed: verified today
- imports-ok (contains) · timeout 900s · max $0
  Expected: llamaindex imports OK: VectorStoreIndex, SimpleDirectoryReader, StorageContext all available
- clean-exit (exit_code) · timeout 900s · max $0
  Expected: 0
Results
LlamaIndex implements the standard RAG pattern, with 40+ integrations and pluggable embedding and vector-store backends. Retrieval memory is the right pick when the knowledge corpus is too large or too changeable to bake into the model.
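To make the retrieval idea concrete, here is a toy sketch of what a vector index does at query time: embed the chunks, embed the query, and return the closest chunks. It is not LlamaIndex's implementation. The embed function uses word counts as a stand-in for a real embedding model, and all names here (embed, cosine, retrieve) are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts, standing in for a real model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(chunks: list[str], query: str, top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


chunks = [
    "invoices are due within 30 days",
    "the office is closed on public holidays",
    "late invoices incur a 2 percent fee",
]
top = retrieve(chunks, "when are invoices due", top_k=2)
print(top)
```

Only the retrieved chunks would go into the prompt, which is why the corpus can grow or change without the prompt growing with it. Swapping embed for a real embedding model is the "pluggable backend" part of the pattern.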
Related workflows
- Crawl4AI: a page to clean, LLM-ready markdown (no API key)
- Khoj + Fable 5: A Second Brain That Knows Your Notes
- Cognee: Knowledge-Graph Memory Over Your Documents
- Obsidian × MCPVault: Write a Note from Any MCP Client
- Obsidian × MCPVault: Read a Note from Any MCP Client
- Obsidian × MCPVault: The Argument Builder