Yeah, that's a fairly well-studied one. Most of these techniques are rather "lossy" compared to actually extending the context window, since the model only sees a compressed or filtered version of the history. The most likely "real solution" is going to be just extending the context window itself, using various tricks and finetuning on longer context lengths.
Here's a bunch of other related methods:
summarizing context - https://arxiv.org/abs/2305.14239
continuous finetuning - https://arxiv.org/pdf/2307.02839.pdf
retrieval augmented generation - https://arxiv.org/abs/2005.11401
knowledge graphs - https://arxiv.org/abs/2306.08302
augmenting the network with a side network - https://arxiv.org/abs/2306.07174
another long-term memory technique - https://arxiv.org/abs/2307.02738
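
To make the retrieval idea from the list concrete, here's a rough toy sketch of a retrieval-style memory. This is not the method from any of the linked papers: it swaps in plain bag-of-words cosine similarity where a real system would use a learned dense retriever, and `ToyMemory` and `build_prompt` are hypothetical names made up for illustration. It just shows the basic loop: store past text as chunks, pull back the chunks most similar to the current question, and paste only those into the prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyMemory:
    """Hypothetical external memory: a list of text chunks with word counts."""

    def __init__(self):
        self.chunks = []

    def add(self, text):
        # Store the raw text next to its bag-of-words vector.
        self.chunks.append((text, Counter(tokenize(text))))

    def retrieve(self, query, k=2):
        # Rank stored chunks by similarity to the query and keep the top k.
        q = Counter(tokenize(query))
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(memory, question, k=2):
    """Assemble a prompt from only the retrieved chunks plus the question."""
    context = "\n".join(memory.retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


memory = ToyMemory()
memory.add("The user's dog is named Biscuit.")
memory.add("The user works as a marine biologist.")
memory.add("The user prefers tea over coffee.")

prompt = build_prompt(memory, "What is the name of the dog?", k=1)
print(prompt)
```

Note that anything the retriever doesn't pick never reaches the model, which is the "lossy" part: with `k=1` here, the job and the tea preference are simply dropped from the prompt.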