Advanced RAG: hybrid search, HyDE, query rewriting, cross-encoder re-ranking, parent document retrieval, and RAPTOR.
Advanced RAG Techniques
# 1. Hybrid Search (BM25 keyword + vector semantic)
# Combine scores: hybrid = 0.5 * bm25 + 0.5 * vector
# (normalise both scores to the same range first; raw BM25 scores are unbounded)
# Covers exact keyword matches as well as semantic similarity
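# A minimal sketch of the weighted combination above, assuming min-max
# normalisation (one common choice, not the only one). The helper names
# `min_max` and `hybrid_scores` and the example scores are hypothetical.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(bm25, vector, alpha=0.5):
    """Weighted sum of normalised BM25 and vector scores for the same docs."""
    bm25_n, vector_n = min_max(bm25), min_max(vector)
    return [alpha * b + (1 - alpha) * v for b, v in zip(bm25_n, vector_n)]


# Made-up scores for three candidate documents (same order in both lists)
bm25 = [12.0, 3.0, 0.0]
vector = [0.20, 0.90, 0.55]

hybrid = hybrid_scores(bm25, vector)
# Document indices ordered from best to worst hybrid score
ranking = sorted(range(len(hybrid)), key=lambda i: hybrid[i], reverse=True)
```

# Raising `alpha` favours exact keyword matches; lowering it favours semantic matches.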
# 2. HyDE (Hypothetical Document Embeddings)
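# Generate a hypothetical answer, embed that answer, then search with its embedding
# (`llm`, `embed`, and `question` are assumed to be defined elsewhere)
hyde_prompt = 'Write a short answer to: ' + question
hypothetical = llm.complete(hyde_prompt)
embedding = embed(hypothetical)  # search with this instead of the raw question's embedding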
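# A self-contained toy illustration of why HyDE can help. Everything here is a
# stand-in: `toy_embed` is a bag-of-words counter, not a real embedding model,
# and `fake_llm` returns a canned answer instead of calling an LLM.

```python
import math
from collections import Counter


def toy_embed(text):
    """Toy bag-of-words embedding (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(prompt):
    """Stand-in for llm.complete: returns a canned hypothetical answer."""
    return "plants convert sunlight into chemical energy using chlorophyll"


docs = [
    "chlorophyll lets plants convert sunlight into chemical energy",
    "why the sky looks blue at noon",
]
toy_question = "why do leaves look green"

# Plain retrieval: embed the question itself
raw_best = max(docs, key=lambda d: cosine(toy_embed(toy_question), toy_embed(d)))

# HyDE: embed the hypothetical answer and search with that instead
toy_hypothetical = fake_llm("Write a short answer to: " + toy_question)
hyde_best = max(docs, key=lambda d: cosine(toy_embed(toy_hypothetical), toy_embed(d)))
```

# In this toy setup the question shares only the word "why" with the wrong document,
# while the hypothetical answer shares vocabulary with the relevant one.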
# 3. Query Rewriting
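# Rephrase the user's question before retrieval, then search with the rewritten query
rewritten_query = llm.complete('Rewrite for better search retrieval: ' + question)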
# 4. Re-ranking with cross-encoder
from sentence_transformers import CrossEncoder
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
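# Score each (question, document) pair jointly; a higher score means more relevant
pairs = [[question, doc] for doc in retrieved_docs]
scores = reranker.predict(pairs)
# List of (doc, score) tuples, best match first
reranked = sorted(zip(retrieved_docs, scores), key=lambda x: x[1], reverse=True)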
# 5. Parent Document Retriever
# Index small child chunks so that matches are precise
# At query time, return the larger parent chunk each match came from (full context for generation)
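# A minimal sketch of the child-to-parent lookup, assuming word-overlap scoring
# as a stand-in for vector search. The names `split_words`, `overlap`, and
# `retrieve_parent`, and the two sample documents, are hypothetical.

```python
def split_words(text, size):
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


parents = {
    "p1": "Hybrid search mixes BM25 with vector similarity. "
          "It helps when queries contain rare exact terms.",
    "p2": "Cross-encoders score a query and document together. "
          "They are slower but more accurate than bi-encoders.",
}

# Index small child chunks, each remembering the id of its parent
children = [
    (chunk, pid)
    for pid, text in parents.items()
    for chunk in split_words(text, 5)
]


def overlap(query, chunk):
    """Toy relevance: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve_parent(query):
    """Match against the small chunks, but return the full parent text."""
    _, pid = max(children, key=lambda c: overlap(query, c[0]))
    return parents[pid]


context = retrieve_parent("slower but more accurate")
```

# The match happens on a five-word chunk, yet the generator receives the whole parent passage.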
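# 6. RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval)
# Recursively cluster chunks and summarise each cluster, then index the chunks and summaries at every level
# Enables answering both specific questions (leaf chunks) and high-level questions (summaries)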
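# A rough sketch of the multi-level index, under heavy simplifications: adjacent
# chunks are grouped instead of clustering by embedding, and `summarise` just
# truncates text instead of calling an LLM. `build_tree` and `summarise` are
# hypothetical names, not part of any RAPTOR library.

```python
def summarise(texts):
    """Stand-in for an LLM summary: keep the first three words of each text."""
    return " / ".join(" ".join(t.split()[:3]) for t in texts)


def build_tree(chunks, group_size=2):
    """Return a list of levels: level 0 is the raw chunks, the last is one root."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each level shrinks")
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        # Placeholder clustering: adjacent groups instead of embedding clusters
        groups = [current[i:i + group_size] for i in range(0, len(current), group_size)]
        levels.append([summarise(g) for g in groups])
    return levels


chunks = [
    "BM25 ranks documents by term frequency",
    "Vector search ranks by embedding similarity",
    "Cross-encoders rerank the top candidates",
    "Parent retrieval returns larger context",
]

levels = build_tree(chunks)
# Flatten every level into one index so retrieval can hit details or summaries
index = [(depth, text) for depth, level in enumerate(levels) for text in level]
```

# Searching `index` can then return a leaf chunk for a specific question or a
# higher-level summary for a broad one.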