Building Reliable RAG
Data prep, retrieval quality, evals, and monitoring for messy real-world data.
3–5 days · For ML engineers, data engineers, and backend developers
Overview
Most RAG systems fail silently: they return plausible-sounding answers backed by the wrong documents. This masterclass focuses on building RAG pipelines that stay accurate, measurable, and maintainable on messy real-world data.
What you’ll build
A complete RAG pipeline with:
- Robust ingestion for heterogeneous document types (PDFs, HTML, databases, Confluence, Slack)
- Hybrid retrieval (dense + sparse + reranking)
- Automated quality evaluation suite
- Production monitoring dashboard
Curriculum
Day 1 — Data Ingestion & Chunking
- Document parsing strategies for PDFs, tables, images, and mixed-format sources
- Chunking methods: fixed-size, semantic, recursive, document-structure-aware
- Metadata extraction and enrichment
- Handling multilingual and domain-specific content
- Data cleaning pipelines for noisy enterprise data
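A taste of the Day 1 exercises: a minimal fixed-size chunker with character overlap, sketched in plain Python. The sizes are illustrative defaults, and the workshop versions go further with semantic and structure-aware splitting.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, each overlapping the last."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break  # the rest of the text is already covered by this chunk
    return chunks
```

Overlap matters because a sentence cut in half at a chunk boundary is hard to retrieve; the overlapping region gives both neighbouring chunks a complete copy.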
Day 2 — Embeddings & Retrieval
- Embedding model selection and fine-tuning for your domain
- Vector databases: Qdrant, Weaviate, pgvector — choosing the right one
- Hybrid search: combining dense vectors, BM25, and metadata filters
- Reranking with cross-encoders
- Query transformation: HyDE, multi-query, step-back prompting
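One of the Day 2 building blocks, sketched in plain Python: reciprocal rank fusion (RRF), a simple way to merge a dense ranking and a sparse (BM25) ranking without tuning score scales. The doc IDs below are made up, and k=60 is the commonly used damping constant, not a course-mandated value.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc IDs; higher ranks contribute more."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # k dampens the top-rank bonus
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d2"]   # e.g. order by vector similarity
sparse = ["d1", "d2", "d4"]  # e.g. order by BM25 score
fused = reciprocal_rank_fusion([dense, sparse])
```

Because RRF only uses ranks, it sidesteps the problem that cosine similarities and BM25 scores live on incompatible scales; a document that appears in both lists (like `d1` here) naturally rises to the top.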
Day 3 — Generation & Grounding
- Prompt engineering for grounded generation
- Citation and source attribution
- Handling “I don’t know” — abstention and confidence estimation
- Multi-turn conversational RAG
- Structured output from RAG (tables, summaries, comparisons)
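The abstention idea from Day 3 in its simplest form: refuse to answer when retrieval is weak, rather than letting the model improvise. This is a deliberately stripped-down sketch with a hypothetical score threshold; the course versions combine retrieval scores with model-side confidence signals.

```python
def answer_or_abstain(query: str, retrieved: list[tuple[str, float]],
                      threshold: float = 0.7) -> str:
    """Abstain when no retrieved passage scores high enough to ground an answer."""
    if not retrieved or max(score for _, score in retrieved) < threshold:
        return "I don't know: no sufficiently relevant source was found."
    # In a real pipeline this would call the LLM with the passages as context;
    # here we just surface the best-scoring source to show the grounding step.
    best_passage, _ = max(retrieved, key=lambda item: item[1])
    return f"Answer grounded in: {best_passage!r}"
```

A hard threshold like this is crude, but it makes the failure mode explicit and measurable, which is the point: an honest "I don't know" is cheaper than a confident hallucination.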
Day 4 — Evaluation & Testing
- Building evaluation datasets from your domain
- Metrics: retrieval precision/recall, answer faithfulness, relevance
- Automated evaluation with LLM-as-judge
- Regression testing: catching quality drops before deployment
- Human-in-the-loop evaluation workflows
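The core Day 4 metrics fit in a few lines; here is a minimal precision@k / recall@k sketch for a single query (document IDs and k are illustrative). The course builds these up into a full evaluation suite over a labelled dataset.

```python
def retrieval_metrics(retrieved: list[str], relevant: set[str],
                      k: int = 5) -> dict[str, float]:
    """Precision@k and recall@k for one query's ranked retrieval results."""
    top_k = retrieved[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    return {
        "precision": hits / len(top_k) if top_k else 0.0,  # share of top-k that is relevant
        "recall": hits / len(relevant) if relevant else 0.0,  # share of relevant docs found
    }
```

Tracking both numbers matters: high precision with low recall means the retriever is timid, while the reverse means it drowns the generator in noise.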
Day 5 — Production & Monitoring
- Deployment architecture: sync vs async, caching strategies
- Monitoring retrieval quality in production
- Feedback loops: user signals to improve retrieval
- Cost optimisation: balancing quality and latency
- Capstone: end-to-end pipeline on your data
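A flavour of the Day 5 monitoring work: a rolling-window tracker that flags when average retrieval quality drops. The window size and alert threshold are illustrative; in the course this feeds a dashboard and alerting rather than a boolean.

```python
from collections import deque

class RetrievalMonitor:
    """Track a rolling mean of top retrieval scores and flag quality drops."""

    def __init__(self, window: int = 100, alert_below: float = 0.6):
        self.scores = deque(maxlen=window)  # oldest scores fall off automatically
        self.alert_below = alert_below

    def record(self, top_score: float) -> bool:
        """Record one query's best retrieval score; return True if alerting."""
        self.scores.append(top_score)
        mean = sum(self.scores) / len(self.scores)
        return mean < self.alert_below
```

A rolling mean over recent queries, rather than per-query alerts, smooths out one-off hard questions while still catching systematic regressions, such as a bad re-index or an embedding model change.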
Prerequisites
- Python proficiency
- Basic familiarity with SQL and APIs
- Sample documents from your domain (we’ll use them in exercises)
Outcomes
Your team leaves with a tested RAG pipeline on your own data, an evaluation suite to catch regressions, and a monitoring setup for production.
Interested in this masterclass?
Tell me about your team and I’ll tailor the programme to your needs.
Book this masterclass