As is typical with any RAG strategy or algorithm, the implicit caveat is that it works on a specific dataset and solves a very specific use case. The thing is, if you have a dataset and a use case, you can build a very custom algorithm that works wonders for the output you need. There need not be anything generic.

My instinct at this point is that these algos look attractive because we are constrained to giving a user a wow moment where they upload something and get to chat with the doc/dataset within minutes. As attractive as that is, it is a distinct second priority to building a system that works 99% of the time, even if it takes a day or two to set up. You get a feel for the data and for the type of questions that may be asked, and create an algo that works for a specific dataset/use-case combo (assuming any more data you add to the system would be similar and work pretty well). There is no silver bullet of the kind we seem to be searching for.




100% agree with you. I've built a number of RAG systems and find that simple Q&A-style use cases actually do fine with traditional chunking approaches.

... and then you have situations where people ask complex questions with multiple logical steps or knowledge-gathering requirements, and there some sort of hierarchical RAG strategy works better.

I think a lot of solutions (including this post) abstract to building knowledge graphs of some sort... But knowledge graphs still require an ontology associated with the problem you're solving and will fail outside of those domains.
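
For anyone unsure what "traditional chunking" means in practice, here is a minimal sketch, assuming fixed-size character windows with overlap and a naive word-overlap lookup standing in for a real embedding search. The names `chunk_text` and `best_chunk` and the default sizes are hypothetical, picked only for illustration, and are not taken from the post or from any library.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    chunk_size and overlap are illustrative defaults, not recommendations.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def best_chunk(question: str, chunks: list[str]) -> str:
    """Return the chunk sharing the most lowercase words with the question.

    A stand-in for vector similarity; raises ValueError if chunks is empty.
    """
    q_words = set(question.lower().split())
    return max(chunks, key=lambda c: len(q_words & set(c.lower().split())))
```

Something this simple can be enough for single-hop Q&A. It has no notion of multi-step reasoning, which is where the hierarchical approaches come in.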



