In my application we use a land-and-expand strategy: a mix of BM25 and semantic search finds a chunk, and before showing it to the LLM we expand it to include everything on that page.
It works pretty well. It might benefit from including some material from the preceding and following pages, but it mostly solves the "isolated chunk" problem.
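Here is a rough sketch of the land-and-expand idea, not our actual code. It assumes each chunk records its document, page number, and position on the page, and that the BM25 and semantic searches each return a ranked list of chunk indices. The names `Chunk`, `rrf_fuse`, and `expand_to_pages` are made up for illustration, and reciprocal rank fusion is just one possible way to mix the two rankings. The `neighbors` parameter covers the "page prior and after" variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    page: int
    position: int  # order of the chunk within its page
    text: str


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of chunk indices with reciprocal rank fusion.

    Assumption: this stands in for whatever BM25/semantic mixing is really used.
    """
    scores = {}
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            scores[idx] = scores.get(idx, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk index for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))


def expand_to_pages(chunks, hit_indices, neighbors=0):
    """Expand each hit chunk to the full text of its page.

    With neighbors=1, the previous and next pages are included as well.
    Each page is emitted at most once, even if several hits land on it.
    """
    by_page = {}
    for chunk in chunks:
        by_page.setdefault((chunk.doc_id, chunk.page), []).append(chunk)

    seen = set()
    contexts = []
    for idx in hit_indices:
        hit = chunks[idx]
        for page in range(hit.page - neighbors, hit.page + neighbors + 1):
            key = (hit.doc_id, page)
            if key in by_page and key not in seen:
                seen.add(key)
                ordered = sorted(by_page[key], key=lambda c: c.position)
                contexts.append((key, " ".join(c.text for c in ordered)))
    return contexts


# Toy corpus: three pages of a hypothetical manual.
chunks = [
    Chunk("manual", 1, 0, "Install the pump."),
    Chunk("manual", 1, 1, "Tighten the bolts."),
    Chunk("manual", 2, 0, "Prime the pump before use."),
    Chunk("manual", 2, 1, "Check the pressure gauge."),
    Chunk("manual", 3, 0, "Store in a dry place."),
]

# Pretend these came back from the two retrievers (indices into `chunks`).
bm25_ranking = [2, 0]
semantic_ranking = [2, 3]

top_hits = rrf_fuse([bm25_ranking, semantic_ranking])[:1]  # "land"
context = expand_to_pages(chunks, top_hits)                # "expand"
```

Deduplicating by page matters in practice: if two retrieved chunks sit on the same page, the LLM should see that page once, not twice.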