two-phase rag pipeline assembly with lcel orchestration
Constructs a complete Retrieval-Augmented Generation pipeline using LangChain Expression Language (LCEL) that separates indexing (one-time document embedding and vector store population) from query execution (per-request retrieval and LLM synthesis). The rag_chain in full_basic_rag.ipynb assembles retriever, prompt templates, and LLM into a single composable expression, enabling declarative pipeline definition without imperative control flow.
Unique: Uses LangChain Expression Language (LCEL) to declaratively compose indexing and query phases into a single reusable chain expression, eliminating boilerplate control flow and enabling runtime chain introspection and modification
vs alternatives: Simpler than building RAG from scratch with raw vector store APIs, and more transparent than black-box RAG frameworks because LCEL makes each pipeline step explicit and swappable
multi-query retrieval with llm-generated query variants
Generates multiple semantically-diverse query variants from a single user question using an LLM, then retrieves documents against all variants in parallel, unions the results, and deduplicates to improve recall. Implemented in Notebook 2 via LLM prompt templates that instruct the model to generate alternative phrasings, followed by concurrent retriever calls and result aggregation.
Unique: Leverages LLM-in-the-loop query expansion with parallel retrieval and union-based deduplication, avoiding hand-crafted query expansion rules and adapting dynamically to domain-specific terminology
vs alternatives: More effective than single-query retrieval for sparse corpora, and more flexible than static query expansion templates because the LLM adapts variants to the specific query context
prompt engineering and template management for rag synthesis
Manages LLM prompts using LangChain PromptTemplate, enabling parameterized prompt construction with context injection, variable substitution, and format specification. Notebooks demonstrate prompts for retrieval evaluation, query generation, answer synthesis, and re-ranking, with explicit separation of system instructions, context, and user input.
Unique: Uses LangChain PromptTemplate for parameterized prompt construction with explicit variable injection, enabling prompt reuse and experimentation without string concatenation
vs alternatives: More maintainable than string concatenation, and more flexible than hard-coded prompts because templates are reusable and variables are explicit
jupyter notebook-based progressive learning curriculum
Provides five structured Jupyter notebooks (Notebooks 1-5) that progressively introduce RAG techniques from basic setup to advanced retrieval and self-correction. Each notebook builds on the previous, introducing new techniques (multi-query, routing, advanced indexing, re-ranking) with executable code, explanations, and reference links. The progression enables learners to understand RAG incrementally rather than all-at-once.
Unique: Provides a structured 5-notebook curriculum that progressively introduces RAG techniques with executable code and explanations, enabling self-paced learning from basic to advanced patterns
vs alternatives: More comprehensive than blog posts or tutorials because it covers the full RAG spectrum, and more practical than academic papers because code is executable and runnable
production boilerplate rag chatbot (full_basic_rag.ipynb)
Provides a self-contained, production-ready RAG chatbot implementation in full_basic_rag.ipynb that can be adapted to custom documents, LLMs, and vector stores. The boilerplate includes document loading, embedding, vector store setup, retrieval chain assembly, and inference loop, enabling developers to fork and customize without building from scratch.
Unique: Provides a complete, self-contained RAG chatbot in a single notebook that can be forked and customized without external dependencies or infrastructure setup
vs alternatives: Faster to deploy than building RAG from scratch, and more customizable than SaaS RAG platforms because code is fully visible and modifiable
semantic and logical routing with runnablebranch
Routes incoming queries to different retrieval or processing paths based on semantic classification or logical rules using LangChain's RunnableBranch construct. Notebook 3 demonstrates routing via LLM classification (e.g., 'is this a factual question or a reasoning task?') and conditional branching to specialized chains (e.g., HyDE for hypothetical document expansion, RAG-Fusion for multi-perspective retrieval).
Unique: Uses LangChain's RunnableBranch to declaratively define conditional routing logic without imperative control flow, enabling runtime inspection and modification of routing conditions
vs alternatives: More maintainable than hard-coded if-else routing, and more transparent than learned routing models because conditions are explicit and auditable
advanced document indexing with multi-vector and parent-document retrieval
Implements sophisticated indexing strategies (Notebook 4) including MultiVectorRetriever for storing summaries/questions alongside full documents, InMemoryByteStore for metadata caching, and Parent Document Retriever for retrieving larger context chunks while querying against smaller summaries. These patterns decouple the retrieval unit (summary) from the context unit (full document), improving both precision and context quality.
Unique: Decouples retrieval granularity (summaries) from context granularity (full documents) using MultiVectorRetriever and parent-child mappings, enabling precise relevance matching without losing contextual information
vs alternatives: More effective than chunk-based retrieval for long documents because it retrieves at the document level while scoring at the summary level, reducing context fragmentation
retrieval re-ranking with cross-encoder models and crag
Applies learned re-ranking to retrieval results using cross-encoder models (e.g., Cohere Rerank API) that score document-query pairs jointly, improving ranking quality beyond embedding-based similarity. Notebook 5 integrates CohereRerank and demonstrates Corrective RAG (CRAG) with LangGraph, which evaluates retrieval quality and iteratively refines queries or retrieves additional documents if confidence is low.
Unique: Combines cross-encoder re-ranking with Corrective RAG (CRAG) using LangGraph state machines, enabling iterative retrieval refinement with explicit quality validation rather than single-pass retrieval
vs alternatives: More effective than embedding-only ranking for complex queries, and more robust than static retrieval because CRAG detects and corrects failures automatically
+5 more capabilities