Key PM Decisions
1. Scope the knowledge base.
What data sources are included? What’s excluded? Start narrow (one document collection) and expand. Every new source adds complexity and potential failure modes.
2. Define freshness requirements.
How quickly must new information be available? Real-time (minutes), near-real-time (hours), or batch (daily)? Freshness requirements drive infrastructure cost.
3. Set the citation standard.
Must every answer cite its sources? Can the AI say “I don’t know”? How are citations displayed? Source citation is the primary trust mechanism in RAG products.
4. Define the “I don’t know” behavior.
When no relevant documents are found, the AI should say so rather than hallucinate. This requires a retrieval confidence threshold: below it, the system admits uncertainty rather than guessing.
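The threshold gate above can be sketched in a few lines. This is a minimal illustration, assuming retrieval returns (document, similarity) pairs scored in [0, 1]; the function name and threshold value are illustrative, and a real system would tune the threshold against an evaluation set.

```python
CONFIDENCE_THRESHOLD = 0.75  # illustrative; tuned empirically per corpus

def answer_or_abstain(retrieved: list[tuple[str, float]]) -> str:
    """Return the best grounding document, or abstain when confidence is low."""
    if not retrieved:
        return "I don't know."
    top_doc, top_score = max(retrieved, key=lambda pair: pair[1])
    if top_score < CONFIDENCE_THRESHOLD:
        # Below the threshold, admit uncertainty instead of guessing.
        return "I don't know."
    return top_doc  # in a real pipeline, passed to the LLM as context

print(answer_or_abstain([("Refund policy: 30 days.", 0.91)]))  # confident
print(answer_or_abstain([("Unrelated doc.", 0.42)]))           # abstains
```

The key product decision is the threshold itself: set it too high and the system abstains on answerable questions; too low and it hallucinates from weak matches.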
The RAG Maturity Ladder
Level 1: Basic RAG
Vector search, fixed chunking, single data source. Good for internal tools and proofs of concept. 2–4 weeks to build.
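"Fixed chunking" at this level usually means splitting documents into equal-size windows with some overlap so that no fact is cut in half at a boundary. A minimal sketch, with illustrative sizes (real systems often chunk by tokens rather than characters):

```python
def fixed_chunks(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlapping edges."""
    step = size - overlap
    # Overlap means each chunk repeats the tail of the previous one,
    # so a sentence spanning a boundary appears whole in at least one chunk.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The trade-off is purely mechanical: larger chunks give the model more context per retrieval but dilute the embedding; more overlap reduces boundary loss but inflates index size and cost.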
Level 2: Production RAG
Hybrid search, re-ranking, multiple data sources, automated ingestion, source citations. Good for customer-facing products. 2–3 months to build.
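"Hybrid" means combining keyword and vector results before re-ranking. One common fusion method is reciprocal rank fusion (RRF); the sketch below assumes each retriever returns an ordered list of document IDs, and the constant k=60 is the conventional default, not a requirement.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists (e.g. keyword and vector results) into one ranking.

    Each document scores 1/(k + rank) per list it appears in, so documents
    ranked well by BOTH retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_c", "doc_d"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

A cross-encoder re-ranker would then re-score only the fused top results, keeping the expensive model off the long tail.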
Level 3: Advanced RAG
Agentic RAG (the AI decides what to search and when), multi-step retrieval, query decomposition, graph-based knowledge structures. For complex domains with interconnected information. 4–6+ months.
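Query decomposition can be sketched as a control loop: split a compound question into sub-queries, retrieve for each, and merge the results. Everything here is a stand-in; in a real system `decompose` would be an LLM call and `retrieve` a search index, and the "and"-splitting below is only a toy stub.

```python
def decompose(query: str) -> list[str]:
    """Toy stand-in for an LLM call that splits a compound question."""
    return [part.strip() + "?" for part in query.rstrip("?").split(" and ")]

def multi_step_retrieve(query: str, retrieve) -> list[str]:
    """Retrieve separately for each sub-query, then merge, deduplicated in order."""
    seen: set[str] = set()
    merged: list[str] = []
    for sub_query in decompose(query):
        for doc in retrieve(sub_query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

The agentic variant wraps this loop in a model-driven decision: after each retrieval step, the LLM judges whether it has enough evidence or should issue another search.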
Level 4: Enterprise RAG
Multi-tenant access control, compliance audit trails, real-time ingestion, cross-language retrieval, feedback-driven re-indexing. For regulated industries at scale. Ongoing investment.
The bottom line: RAG is the most common architecture for enterprise AI products because it solves the knowledge problem without retraining models. But “just add RAG” is deceptively simple. Answer quality depends on chunking strategy, retrieval approach, source data quality, and continuous evaluation. The PM who understands the full pipeline can diagnose issues, set realistic expectations, and make informed trade-offs among cost, quality, and freshness.