
Chunking for RAG


Choose among fixed-size, recursive, semantic, and document-aware chunking. A common starting point is 200-500 tokens per chunk with 10-20% overlap.

Document Chunking Strategies for RAG

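How you split documents strongly affects retrieval quality. Chunks that are too large mix the relevant passage with unrelated text and lose precision; chunks that are too small lose the surrounding context needed to answer a question.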

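# 1. Fixed-size with overlap (simple baseline)
# Note: size and overlap are counted in characters here, not tokens.
def fixed_chunks(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap  # each chunk starts `step` characters after the previous one
    return [text[i:i + size] for i in range(0, len(text), step)]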

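# 2. Recursive character splitter (LangChain -- recommended)
# Newer LangChain releases ship this class in the langchain-text-splitters package:
#   from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,      # measured in characters by default (length_function=len)
    chunk_overlap=50,
    separators=['\n\n', '\n', '. ', ' ', '']  # tried in order: paragraph, line, sentence, word, character
)
chunks = splitter.split_text(text)  # text: the document string to split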
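
The idea behind a recursive splitter can be sketched in plain Python: split on the coarsest separator first, and only fall back to finer separators for pieces that are still too long. This is a simplified illustration, not LangChain's implementation; `recursive_split` and `max_len` are hypothetical names, lengths are counted in characters, and the sketch leaves out overlap.

```python
def recursive_split(text, max_len=500, separators=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator first; recurse with finer ones for oversized pieces."""
    if len(text) <= max_len:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard character cuts.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= max_len:
            current = candidate          # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)       # current chunk is full, emit it
            current = ""
        if len(piece) <= max_len:
            current = piece              # piece fits on its own, start a new chunk
        else:
            chunks.extend(recursive_split(piece, max_len, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "A" * 30 + "\n\n" + "B" * 30 + "\n\n" + "C" * 120
parts = recursive_split(sample, max_len=50)
```

Here the two short paragraphs stay intact, while the 120-character paragraph has no finer separator to split on and ends up hard-cut into pieces of at most 50 characters.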

# 3. Semantic chunking (split at topic boundaries)
# Compute embedding similarity between sentences
# Split when similarity drops below threshold
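
A minimal sketch of the semantic approach described above, under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the sentence splitter is a naive regex, and the `threshold` of 0.15 is an arbitrary value chosen for the example. In practice you would swap in a real sentence-embedding model and tune the threshold on your own data.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def semantic_chunks(text: str, threshold: float = 0.15) -> list[str]:
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        # Start a new chunk when adjacent sentences are dissimilar.
        if cosine(embed(prev), embed(sent)) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks


doc = ("Cats purr when happy. Cats sleep most of the day. "
       "Python lists are mutable. Python tuples are immutable.")
topic_chunks = semantic_chunks(doc)
```

With this toy similarity, the two sentences about cats land in one chunk and the two about Python in another, because the only adjacent pair with no shared words is the one that crosses the topic boundary.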

# 4. Document-aware chunking
# Markdown: split by heading (# ## ###)
# Code: split by class or function
# PDF: split by page or visual section break
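
For the Markdown case, a heading-based splitter might look like the sketch below. `markdown_chunks` is a hypothetical helper, not a library function; it assumes ATX-style headings (`#`, `##`, `###`) and does not handle `#` lines inside fenced code blocks, which it would mistake for headings.

```python
import re

HEADING = re.compile(r"^(#{1,3})\s+(.*)$")  # matches "# Title", "## Title", "### Title"


def markdown_chunks(markdown: str) -> list[dict]:
    # The first entry collects any text that appears before the first heading.
    sections = [{"section": None, "lines": []}]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append({"section": match.group(2).strip(), "lines": []})
        else:
            sections[-1]["lines"].append(line)

    chunks = []
    for section in sections:
        body = "\n".join(section["lines"]).strip()
        if body:  # skip headings with no body text
            chunks.append({"section": section["section"], "text": body})
    return chunks


md = "# Intro\nHello.\n\n## Setup\nInstall it.\nRun it.\n"
md_chunks = markdown_chunks(md)
```

Each chunk keeps its section title, which can later be stored as metadata alongside the text. Sections that are still too long can be passed through a size-based splitter afterwards.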

# Best practice guidelines
# chunk_size:    200-500 tokens for most use cases
# chunk_overlap: 10-20% of chunk_size
# Add metadata: source, page, section title to each chunk
# Test retrieval quality with diverse questions
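
The overlap and metadata guidelines can be combined in one small helper. This is a sketch under assumptions: `chunk_with_metadata` and its field names (`source`, `chunk_index`, `start_char`) are hypothetical, and sizes are counted in characters rather than tokens, so a tokenizer would be needed to apply the 200-500 token guideline literally.

```python
def chunk_with_metadata(text: str, source: str, size: int = 400,
                        overlap_ratio: float = 0.15) -> list[dict]:
    if size <= 0 or not 0 <= overlap_ratio < 1:
        raise ValueError("size must be positive and overlap_ratio in [0, 1)")
    overlap = round(size * overlap_ratio)   # e.g. 15% of 400 characters -> 60
    step = max(1, size - overlap)           # guard against a zero step
    records = []
    for index, start in enumerate(range(0, len(text), step)):
        records.append({
            "text": text[start:start + size],
            "source": source,               # where the chunk came from
            "chunk_index": index,           # position within the document
            "start_char": start,            # offset, useful for citing or highlighting
        })
    return records


digits = "".join(str(i % 10) for i in range(200))
records = chunk_with_metadata(digits, source="example.txt", size=100, overlap_ratio=0.2)
```

With `size=100` and a 20% overlap, each chunk starts 80 characters after the previous one, so the last 20 characters of one chunk are repeated at the start of the next.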


Topic Quiz · 1 question

Test your understanding before moving on

1. What does chunk overlap do in document chunking?

💡 Overlap preserves context at chunk boundaries so information is not lost when a sentence spans two chunks.
