Advanced RAG Techniques

This tutorial covers six advanced retrieval-augmented generation (RAG) techniques: hybrid search, HyDE, query rewriting, cross-encoder re-ranking, parent document retrieval, and RAPTOR.

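# 1. Hybrid Search (BM25 keyword + vector semantic)
# Normalize both scores to a common range first (raw BM25 scores are unbounded,
# cosine similarities are not), then combine: hybrid = 0.5 * bm25 + 0.5 * vector
# Keyword search catches exact terms (IDs, names, rare words); vector search catches paraphrases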
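
The weighted sum above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the helper names `min_max` and `hybrid_scores`, the `alpha` parameter, and the sample scores are all assumptions made for the example, and min-max scaling is only one of several reasonable ways to put the two score types on the same footing.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(bm25, vector, alpha=0.5):
    """Blend normalized keyword and vector scores; alpha is the BM25 weight."""
    bm25_n, vector_n = min_max(bm25), min_max(vector)
    doc_ids = set(bm25_n) | set(vector_n)
    # A document missing from one retriever contributes 0 for that side.
    fused = {
        d: alpha * bm25_n.get(d, 0.0) + (1 - alpha) * vector_n.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Made-up scores for three documents, purely for illustration.
bm25_raw = {"doc_a": 12.0, "doc_b": 4.0, "doc_c": 8.0}
vector_raw = {"doc_a": 0.70, "doc_b": 0.90, "doc_c": 0.60}
ranking = hybrid_scores(bm25_raw, vector_raw)
```

With `alpha=0.5` both retrievers count equally. Raising `alpha` favours exact keyword matches, and lowering it favours semantic similarity.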

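# 2. HyDE (Hypothetical Document Embeddings)
# Ask the LLM for a hypothetical answer, embed that answer, then search with its embedding
# `llm` and `embed` are placeholders for your own LLM client and embedding function
hyde_prompt = 'Write a short answer to: ' + question
hypothetical = llm.complete(hyde_prompt)
embedding = embed(hypothetical)  # query the vector index with this, not the raw question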
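
To make the HyDE flow concrete without calling a real model, here is a self-contained sketch. Everything in it is a stand-in: `fake_llm` returns a canned draft, `toy_embed` is a bag-of-words counter rather than a learned embedding, and `hyde_search` is a hypothetical helper name. In practice you would swap in your own LLM client and embedding model.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(prompt):
    """Hypothetical LLM stub that returns a canned draft answer."""
    return "the mitochondria produce atp which supplies energy to the cell"


def hyde_search(question, corpus, llm=fake_llm, embed=toy_embed, k=1):
    draft = llm("Write a short answer to: " + question)
    query_vec = embed(draft)  # embed the draft answer, not the question
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


corpus = [
    "atp is produced in the mitochondria and supplies energy",
    "the cell wall gives a plant cell its rigid shape",
]
top = hyde_search("What powers a cell?", corpus)
```

The question shares only the word "cell" with the corpus, and that word appears in the wrong passage. The draft answer, by contrast, shares several terms with the relevant one. That vocabulary gap is what HyDE is meant to close.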

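# 3. Query Rewriting
# Have the LLM rephrase the user's question into a search-friendly query before retrieval
rewritten_query = llm.complete('Rewrite for better search retrieval: ' + question)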

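# 4. Re-ranking with cross-encoder
# Score each (question, document) pair jointly, then sort the first-stage results by that score
from sentence_transformers import CrossEncoder

reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
# retrieved_docs: list of passage strings returned by the first-stage retriever
pairs = [(question, doc) for doc in retrieved_docs]
scores = reranker.predict(pairs)  # one relevance score per pair
# reranked is a list of (doc, score) tuples, highest score first
reranked = sorted(zip(retrieved_docs, scores), key=lambda x: x[1], reverse=True)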

# 5. Parent Document Retriever
# Index small chunks so that matching is precise
# At query time, return the larger parent chunk each hit came from (full context for generation)
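
The small-chunk and parent-chunk split can be sketched as follows. The helper names, the six-word chunk size, and the word-overlap scorer are assumptions chosen to keep the example dependency-free. A real system would score child chunks with a vector index instead.

```python
def split_into_children(text, size=6):
    """Split text into fixed-size word windows (the small 'child' chunks)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_child_index(parents):
    """Return a list of (child_text, parent_id) pairs."""
    index = []
    for parent_id, text in parents.items():
        for child in split_into_children(text):
            index.append((child, parent_id))
    return index


def overlap(query, text):
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_parents(query, index, parents, k=2):
    # Rank the small chunks, then map each hit back to its parent.
    hits = sorted(index, key=lambda item: overlap(query, item[0]), reverse=True)[:k]
    seen, results = set(), []
    for _, parent_id in hits:
        if parent_id not in seen:  # several children may share one parent
            seen.add(parent_id)
            results.append(parents[parent_id])
    return results


parents = {
    "p1": "Refunds are issued within ten days. Contact support with your order number to start a refund request.",
    "p2": "Shipping takes three to five days. Express delivery is available for an extra fee at checkout.",
}
index = build_child_index(parents)
context = retrieve_parents("how do I start a refund request", index, parents, k=1)
```

The query matches only the last child chunk of `p1`, but the generator receives the whole parent passage, including the sentence about the ten-day window.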

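# 6. RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval)
# Cluster chunks, summarise each cluster, then repeat on the summaries to build a tree
# Index every level: leaf chunks answer specific questions, summaries answer high-level ones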
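
The multi-level indexing idea can be illustrated with a toy tree builder. This sketch deliberately swaps in simplifications: `cluster_adjacent` groups neighbouring chunks instead of clustering by embedding, and `stub_summarise` truncates text instead of calling an LLM. Both are hypothetical stand-ins, passed in as parameters so real implementations can replace them.

```python
def cluster_adjacent(nodes, size=2):
    """Stand-in for real clustering: group neighbouring nodes."""
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def stub_summarise(texts):
    """Stand-in for an LLM summary: keep the first three words of each text."""
    return " / ".join(" ".join(t.split()[:3]) for t in texts)


def build_tree(chunks, cluster=cluster_adjacent, summarise=stub_summarise):
    """Return a list of levels; level 0 holds the raw chunks."""
    levels = [list(chunks)]
    # Keep summarising until a single root summary remains.
    while len(levels[-1]) > 1:
        groups = cluster(levels[-1])
        levels.append([summarise(group) for group in groups])
    return levels


def flatten(levels):
    """Collect every node from every level, tagged with its level number."""
    return [(depth, text) for depth, nodes in enumerate(levels) for text in nodes]


chunks = [
    "BM25 ranks documents by term overlap.",
    "Dense vectors capture semantic similarity.",
    "Cross-encoders score query and document jointly.",
    "Parent retrieval returns the enclosing passage.",
]
levels = build_tree(chunks)
index = flatten(levels)
```

Embedding and searching every entry in `index` lets a single query match either a leaf chunk or a higher-level summary, depending on how broad the question is.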
