
What's interesting is that we're currently training AI on how to use vector databases. The next generation of LLMs, trained on GitHub from the era of LangChain and FOSS vector DBs, will be able to self-program their own long-term memory recall. I don't think that simply chunking and storing vectors for all of the text the LLM reads is the best approach, but a model might be able to apply a strategy better suited to each unique situation.
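For reference, here is a minimal sketch of the chunk-and-embed baseline being critiqued. This is illustrative only, not any particular library's API; embed() is a stand-in for a real embedding model (e.g. an API call), replaced here with a fake deterministic vector so the sketch runs:

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Stand-in for a real embedding model. A seeded RNG gives a
        # repeatable fake unit vector so the sketch is self-contained.
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        v = rng.standard_normal(384)
        return v / np.linalg.norm(v)

    class NaiveMemory:
        """Chunk everything the model reads, embed it, store it."""
        def __init__(self, chunk_size: int = 200):
            self.chunk_size = chunk_size
            self.chunks: list[str] = []
            self.vectors: list[np.ndarray] = []

        def ingest(self, text: str) -> None:
            # Fixed-size character chunks: the "dumb" strategy in question.
            for i in range(0, len(text), self.chunk_size):
                chunk = text[i:i + self.chunk_size]
                self.chunks.append(chunk)
                self.vectors.append(embed(chunk))

        def recall(self, query: str, k: int = 3) -> list[str]:
            q = embed(query)
            # Cosine similarity; vectors are already unit length.
            sims = np.array([v @ q for v in self.vectors])
            top = np.argsort(-sims)[:k]
            return [self.chunks[i] for i in top]

The point of the comment is that a model able to write this kind of code could also choose per situation when to chunk, what to embed, and what to discard, rather than applying one fixed policy.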


That's fascinating. As a novice to the problem, are there any resources you could link about this? I'm new to studying AI, but I've been prototyping connecting GPT to a nonrelational database to serve as a stand-in for long-term memory. My problem so far with GPT-3 has been getting it to use any consistent schema: it will write to the database in one generated schema but try to recall with another. This is the first I've heard of using vector databases for the task.
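One common fix for that kind of schema drift is to never let the model define the schema at all: fix it in code, have the prompt instruct the model to emit exactly those keys, and validate before writing. A hedged sketch, where the store list and the key names are hypothetical stand-ins for the nonrelational DB, not any specific product's API:

    import json

    MEMORY_SCHEMA = {"subject", "predicate", "value"}  # fixed in code, not model-chosen

    def validate_record(raw: str) -> dict:
        """Parse model output and enforce the fixed schema."""
        record = json.loads(raw)
        if set(record) != MEMORY_SCHEMA:
            raise ValueError(f"schema drift: got keys {sorted(record)}")
        return record

    store: list[dict] = []  # stand-in for the nonrelational DB

    def remember(model_output: str) -> None:
        store.append(validate_record(model_output))

    def recall(subject: str) -> list[dict]:
        # Recall queries use the same fixed keys the write path enforced.
        return [r for r in store if r["subject"] == subject]

    # The prompt would instruct GPT-3 to emit exactly these keys:
    remember('{"subject": "user", "predicate": "likes", "value": "jazz"}')
    print(recall("user"))

The vector-database approach sidesteps the schema problem differently: instead of asking the model for structured records, you embed free text and match on semantic similarity at recall time.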


I don't think chunking is the optimal approach either. I can envision a future where embeddings have variable length, and comparing two variable-length embeddings would require a more complex similarity metric. Vector databases will need to adjust to this reality.
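One existing instance of this idea is late interaction (ColBERT-style): each text is represented as a variable-length set of token vectors, and similarity is a MaxSim aggregation rather than a single dot product. A minimal sketch of that metric:

    import numpy as np

    def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
        """Late-interaction similarity between variable-length embeddings.

        query_vecs: (m, d) unit vectors; doc_vecs: (n, d) unit vectors;
        m and n can differ per text. For each query vector, take its best
        match among the document vectors, then sum those maxima.
        """
        sims = query_vecs @ doc_vecs.T        # (m, n) cosine similarities
        return float(sims.max(axis=1).sum())  # MaxSim aggregation

    def unit(x: np.ndarray) -> np.ndarray:
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Two "documents" with different numbers of token vectors:
    rng = np.random.default_rng(0)
    q = unit(rng.standard_normal((4, 64)))    # 4 query token vectors
    d1 = unit(rng.standard_normal((10, 64)))  # 10 doc token vectors
    d2 = unit(rng.standard_normal((7, 64)))   # 7 doc token vectors
    print(maxsim(q, d1), maxsim(q, d2))

This is exactly the kind of metric that breaks the single-vector assumption most vector databases are built around, since the index has to store and score a set of vectors per item.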



