41.4 Context packing: selecting the best chunks under budget
The Knapsack Problem
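You have a 30k-token context budget, but retrieval returned 50 documents totaling 100k tokens. Which ones do you keep? This is a knapsack problem: each chunk has a cost (its token count) and a value (its relevance to the query), and the goal is to maximize total value without exceeding the budget.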
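A minimal sketch of budgeted selection, assuming each chunk already carries a token count and a relevance score. The `Chunk` type and `pack_greedy` function are hypothetical names introduced for illustration. The approach shown is a greedy score-per-token heuristic, not an exact knapsack solver, so it can miss the optimal set.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    tokens: int   # token count, assumed precomputed with your tokenizer
    score: float  # relevance score, e.g. from a retriever or reranker


def pack_greedy(chunks, budget):
    """Greedy knapsack heuristic: take chunks in order of score per token,
    skipping any chunk that would push the total over the budget."""
    ranked = sorted(chunks, key=lambda c: c.score / max(c.tokens, 1), reverse=True)
    packed, used = [], 0
    for chunk in ranked:
        if used + chunk.tokens <= budget:
            packed.append(chunk)
            used += chunk.tokens
    return packed, used


candidates = [
    Chunk("a", tokens=10, score=0.9),
    Chunk("b", tokens=25, score=0.8),
    Chunk("c", tokens=5, score=0.1),
    Chunk("d", tokens=10, score=0.5),
]
packed, used = pack_greedy(candidates, budget=30)
```

Ranking by score per token favors short, dense chunks. If a single long chunk is essential, an exact dynamic-programming knapsack or a "must include" list may serve better than this heuristic.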
Packing Strategies
- Rerank: Use a cross-encoder to score each chunk's relevance to the query. Keep the top N that fit the budget.
- Maximal Marginal Relevance (MMR): Pick the most relevant result first. For each later pick, choose the result that is still relevant to the query but different from everything already selected. This prevents "5 versions of the same doc" from clogging the context.
- Summarize: Keep the top 5 docs in full. Ask a cheap model to compress the bottom 20 docs into a summary of about 1k tokens.
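The MMR strategy above can be sketched as follows, assuming you already have embedding vectors for the query and for each candidate. The function names and the `lam` trade-off parameter are illustrative choices, not taken from any particular library. Higher `lam` favors relevance and lower `lam` favors diversity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query_vec, doc_vecs, k, lam=0.7):
    """Return indices of up to k docs chosen by Maximal Marginal Relevance."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            # Redundancy is the similarity to the closest already-selected doc.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0, 0.0]
docs = [
    [1.0, 0.10, 0.0],  # near-duplicate of the next doc
    [1.0, 0.12, 0.0],
    [0.0, 1.00, 0.0],  # covers a different aspect of the query
]
diverse = mmr(query, docs, k=2, lam=0.5)
relevance_only = mmr(query, docs, k=2, lam=1.0)
```

With `lam=1.0` the selection reduces to plain top-k by relevance and returns the two near-duplicates. With `lam=0.5` the second pick switches to the document that covers a different aspect. In practice the selected indices would then be passed through a token-budget check before being placed in the prompt.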