ADHDecode

RAG Articles

50 articles

Measure RAG Retrieval: MRR, NDCG, Hit Rate Explained

Retrieval-Augmented Generation (RAG) systems don't magically know the answer; they retrieve relevant documents first and then generate.

6 min read

Secure Your RAG Pipeline Against Prompt Injection

Prompt injection is the silent killer of RAG security, where an attacker subtly manipulates your RAG system's behavior by embedding malicious instructio.

4 min read

Self-RAG: Ground Answers Through Iterative Reflection

Self-RAG is a technique that allows Large Language Models (LLMs) to critically evaluate their own generated text and retrieve relevant information to impr.

4 min read

RAG Sentence Window Retrieval: Expand Context Smartly

RAG Sentence Window Retrieval works by expanding the context around a retrieved document chunk to include surrounding sentences, ensuring the LLM has a .

4 min read

Step-Back Prompting in RAG: Abstract Before Retrieving

Step-back prompting in RAG is actually about avoiding the initial retrieval, not improving it. Let's see what this looks like in practice.

3 min read

RAG Table and Image Extraction: Parse Non-Text Content

The RAG system can't parse non-text content because its core logic is designed for string manipulation, and it's encountering binary data or structured .

5 min read

RAG with Tool Use: Integrate Agents for Dynamic Retrieval

The most surprising thing about RAG with tool use is that the "retrieval" part often becomes t.

3 min read

Compare Vector Databases for RAG: Pinecone, Weaviate, Qdrant

Vector databases are the engine behind Retrieval Augmented Generation (RAG), but choosing the right one for your needs can feel like picking a needle in a.

4 min read

RAG A/B Testing: Compare and Validate Retrieval Strategies

RAG A/B testing is less about comparing two AIs and more about comparing how well two different retrieval mechanisms can feed information to a single AI.

4 min read

Agentic RAG: Build Multi-Step Planning Pipelines

Agentic RAG transforms simple retrieval into a dynamic, multi-step reasoning process. Imagine you have a complex question that requires more than just f.

4 min read

RAG Architecture: Every Component Explained

Retrieval Augmented Generation (RAG) isn't just about finding relevant documents; it's a sophisticated dance where a language model learns to ask better q.

3 min read

RAG Chunking: Find the Optimal Chunk Size

The most surprising thing about RAG chunking is that bigger chunks aren't always better, and sometimes, much smaller chunks can lead to dramatically imp.

4 min read

RAG Citations: Ground Every Answer with Source Attribution

A RAG system's most surprising output isn't the answer itself, but the precise, verifiable lineage of that answer back to its source documents.

2 min read

Build a Code Repository RAG Pipeline

A code repository is a latent knowledge base, and RAG is the key to unlocking its secrets without needing to train a massive, proprietary model.

3 min read

ColPali RAG: Multimodal Document Retrieval with Visuals

A practical guide covering RAG setup, configuration, and troubleshooting with real-world examples.

2 min read

RAG Contextual Compression: Filter Irrelevant Passages

A practical guide covering RAG setup, configuration, and troubleshooting with real-world examples.

3 min read

Anthropic Contextual Retrieval: Boost RAG Accuracy

The core problem Anthropic's contextual retrieval solves isn't just finding relevant documents, but actively shaping the LLM's understanding by filterin.

3 min read

RAG with Conversation History: Build Multi-Turn QA

A practical guide covering RAG setup, configuration, and troubleshooting with real-world examples.

2 min read

Corrective RAG: Adapt Retrieval When Confidence Is Low

The most surprising thing about RAG is that retrieval, the very foundation of RAG, is often its weakest link, and we've been largely ignoring it.

3 min read

Reduce RAG Costs: Caching, Batching, Model Selection

Caching, batching, and model selection aren't just optimizations; they're fundamental to making Retrieval Augmented Generation (RAG) economically viable f.

5 min read

Fine-Tune Embeddings for Domain-Specific RAG

Fine-tuning embeddings for your Retrieval Augmented Generation (RAG) system can dramatically improve its ability to understand and retrieve information re.

4 min read

RAG Embedding Cache: Cut Latency and API Costs

The most surprising thing about RAG embedding caches is that they don't actually store embeddings; they store the queries that produced those embeddings.

2 min read

Choose the Right Embedding Model for Your RAG Pipeline

Embedding models are the unsung heroes of Retrieval Augmented Generation (RAG), and picking the wrong one can turn your sophisticated pipeline into a glor.

4 min read

Build a Production RAG Pipeline End to End

Retrieval Augmented Generation (RAG) pipelines are often described as "just connecting a retriever to a generator," but the real magic, and the most surpr.

2 min read

RAG Enterprise Architecture: Scale to Millions of Docs

A RAG system's true power isn't in its retrieval accuracy, but in how it uses that retrieved information to generate a coherent and contextually relevan.

4 min read

Evaluate RAG with RAGAS: Faithfulness, Recall, Precision

The most surprising thing about evaluating Retrieval Augmented Generation (RAG) is that the metrics you think are about generation quality are actually re.

2 min read

GraphRAG: Combine Knowledge Graphs with Vector Search

GraphRAG isn't just about stuffing your knowledge graph into a vector database; it's about getting vector search to understand the relationships in your.

3 min read

Reduce RAG Hallucinations: Grounding and Verification

The most surprising thing about reducing RAG hallucinations is that the problem isn't just about finding more relevant documents, but about how the retr.

5 min read

RAG Hybrid Search: Combine BM25 and Semantic Retrieval

A practical guide covering RAG setup, configuration, and troubleshooting with real-world examples.

3 min read

HyDE RAG: Generate Hypothetical Documents to Improve Recall

HyDE is a technique that uses a large language model (LLM) to generate a hypothetical answer to a user's query, and then uses that hypothetical answer as .

3 min read

Optimize RAG Indexing: Faster Ingestion at Scale

The core innovation of RAG isn't just retrieving documents; it's retrieving relevant snippets based on a query, and the indexing process is where that s.

3 min read

RAG Ingestion: Batch and Incremental Update Strategies

RAG ingestion isn't just about loading data; it's about intelligently managing its lifecycle to keep your retrieval system sharp and responsive.

3 min read

Keep Your RAG Knowledge Base Fresh: Update Strategies

The most surprising truth about keeping a Retrieval Augmented Generation (RAG) knowledge base fresh is that the "freshness" problem isn't about how often .

5 min read

Late Chunking in RAG: Preserve Context Across Chunks

Conventional up-front chunking fundamentally breaks the continuity of information by splitting documents at arbitrary points, making it impossible for a RAG system to re.

7 min read

Optimize RAG Latency: Hit P99 Targets in Production

Retrieval Augmented Generation (RAG) often feels like a black box where latency just happens, but the real secret is that most of the P99 tail is usually .

4 min read

RAG LLM Cache: Semantic Deduplication for Speed

Retrieval Augmented Generation (RAG) LLM caches are often described as simply storing past queries and their results, but their real power, and a signific.

3 min read

Long Context vs RAG: When Each Approach Wins

Long context windows are surprisingly often worse than RAG for tasks requiring factual recall. Imagine you're trying to answer a question about a specif.

4 min read

RAG Metadata Filtering: Query Structured Data Precisely

RAG metadata filtering lets you go beyond simple keyword matching to retrieve documents based on precise, structured data attributes, dramatically impro.

3 min read

Monitor RAG Retrieval Quality in Production

Retrieval Augmented Generation (RAG) systems, when deployed in production, face a unique challenge: the quality of the retrieved context directly dictates.

8 min read

RAG Multi-Query: Generate Query Variants with an LLM

RAG Multi-Query doesn't just generate more questions; it fundamentally changes how retrieval works by treating search as a language problem, not a keywo.

3 min read

RAG Multi-Tenant: Isolate Data Between Customers

A multi-tenant RAG system can actually provide stronger data isolation than a single-tenant setup, if designed correctly.

3 min read

Multimodal RAG: Retrieve Across Images and Text

Retrieval augmented generation (RAG) typically treats text and images as completely separate entities, but what if they could talk to each other?

3 min read

RAG Open Source vs Managed: Compare Costs and Trade-offs

Open-source RAG solutions can be cheaper than managed services, but the total cost of ownership (TCO) often favors managed services due to hidden operatio.

3 min read

RAG Parent-Child Retrieval: Expand Context on Demand

Retrieval Augmented Generation (RAG) often struggles with retrieving only the most relevant snippets, leading to either too much noisy context or too litt.

3 min read

RAG PDF Ingestion: Parse Tables, Images, Complex Layouts

PDFs are often treated as opaque blobs, but the real magic is how a RAG system can coax structured data out of them, even when the layout is a mess.

3 min read

RAG Production Pipeline: Reliable Architecture Patterns

The most surprising thing about RAG production pipelines is that their reliability often hinges on what you don't retrieve, not just what you do.

2 min read

RAG Query Routing: Direct Queries to the Right Index

RAG query routing is all about ensuring that when a user asks a question, the right piece of information is retrieved from your knowledge base, and not .

2 min read

RAG Query Transformation: Rewrite Queries for Better Recall

The most surprising thing about query rewriting for RAG is that the LLM often makes your search worse if you don't guide it precisely.

3 min read

RAG Fusion: Merge Rankings with Reciprocal Rank Fusion

RAG Fusion is a technique that combines multiple search results to produce a single, more relevant ranking, often outperforming individual search method.

3 min read

RAG Reranking: Cohere and Cross-Encoders for Precision

Reranking with Cohere and cross-encoders is surprisingly effective because it shifts the focus from retrieving any relevant document to retrieving the m.

4 min read

© 2026 ADHDecode. All content is free.
