
If someone wanted to build this locally, to fetch discussions where some topic X came up or to find a person who had shown interest in a certain thing X, how would they go about it?

One way of doing it is to embed messages together with the context of the preceding messages, starting a new chunk whenever the topic changes. Otherwise, a simple similarity search against the user-prompt embedding would return messages from irrelevant topics, since every message would carry context accumulated from the start of the conversation.
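A minimal sketch of that topic-aware chunking, using Jaccard word overlap as a cheap stand-in for real embedding similarity (in practice you'd use cosine similarity over sentence embeddings; the threshold and the example messages are made up):

```python
import re

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a toy proxy for embedding cosine similarity."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def chunk_by_topic(messages, threshold=0.1):
    """Start a new chunk whenever a message looks dissimilar to the current chunk."""
    chunks, current = [], []
    for msg in messages:
        if current and jaccard(" ".join(current), msg) < threshold:
            chunks.append(current)
            current = []
        current.append(msg)
    if current:
        chunks.append(current)
    return chunks

messages = [
    "anyone up for minecraft tonight",
    "sure, minecraft sounds fun",
    "unrelated: did the CI build pass?",
]
print(chunk_by_topic(messages))  # two chunks: the Minecraft talk, then the CI question
```

Each chunk (message plus its local context) is then embedded and stored, instead of embedding every message with the full history behind it.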

Then embed the user prompt and run a similarity search, either on the query itself or on a hypothetical statement generated from the prompt, the so-called HyDE (Hypothetical Document Embeddings) approach: you ask an LLM to generate a hypothetical response to the query and then use its vector, alongside the query vector, to improve search quality.

For example, if the user query is "find me who is interested in playing Minecraft on Tuesday", the LLM will generate a response like "I play Minecraft on Tuesdays", and we can search with the vector of that LLM output against the vector DB, which holds all the messages along with their context.
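The HyDE step above can be sketched like this. The bag-of-words "embedding" and the hard-coded hypothetical answer are stand-ins for a real embedding model and a real LLM call, and the corpus messages are invented:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; replace with a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "I play Minecraft on Tuesdays",
    "the build is broken again",
    "lunch at noon anyone",
]

query = "find me who is interested in playing Minecraft on Tuesday"
# HyDE: an LLM writes a hypothetical *answer* to the query (stubbed here).
hypothetical = "I play Minecraft on Tuesdays"

# Combine query and hypothetical-answer vectors, then search the corpus.
q_vec = embed(query) + embed(hypothetical)
best = max(corpus, key=lambda doc: cosine(q_vec, embed(doc)))
print(best)  # -> I play Minecraft on Tuesdays
```

The hypothetical answer is phrased like the documents you want to find, so its vector lands closer to the right messages than the question's vector does.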

However, I am not sure how this will work in scenarios where the user has sent a message asking "Will you play Minecraft on Tuesday?" and person A has responded with just "Yes". How can the model find person A? Should we build a summary of each person based on their conversations?
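One common workaround, short of per-person summaries: at indexing time, expand terse replies with the message they answer and the author's name, so the "Yes" becomes searchable on its own. A sketch, assuming a hypothetical `expand_reply` helper and invented names:

```python
def expand_reply(author, text, parent_text=None):
    """Build the string that actually gets embedded/indexed.

    Terse replies (heuristic: under 5 words) are rewritten to include the
    question they answer, so their embedding carries the real topic.
    """
    if parent_text and len(text.split()) < 5:
        return f'{author} replied "{text}" to: {parent_text}'
    return f"{author}: {text}"

doc = expand_reply("Alice", "Yes", "Will you play Minecraft on Tuesday?")
print(doc)  # -> Alice replied "Yes" to: Will you play Minecraft on Tuesday?
```

Searching for "who plays Minecraft on Tuesday" can now match this expanded document, and the author's name comes along with the hit. Per-person summaries are a complementary option when interests are spread across many messages.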

Also, the whole process might be computationally slow. How do we improve speed and performance?
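Most of the speed comes from doing work at write time, not query time: embed each message once when it arrives, pre-normalize the vectors so search is a single dot product per document, and swap the brute-force loop for an approximate-nearest-neighbor index (FAISS, hnswlib, or a vector DB) once the corpus grows. A toy sketch with made-up 3-dimensional vectors:

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# Built once, at indexing time (vectors here are stand-ins for embeddings):
index = [
    ("msg-1", normalize([1.0, 0.0, 1.0])),
    ("msg-2", normalize([0.0, 1.0, 0.0])),
]

def search(query_vec, k=1):
    """Cosine search reduced to dot products, since everything is pre-normalized."""
    q = normalize(query_vec)
    scored = [(sum(a * b for a, b in zip(q, v)), doc_id) for doc_id, v in index]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

print(search([1.0, 0.1, 0.9]))  # -> ['msg-1']
```

The HyDE LLM call is usually the real latency bottleneck, so it helps to make it optional (fall back to plain query embedding) or to cache generated hypotheticals for repeated queries.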

(a noob here who wanted to build a similar solution)





