
I am a medical student with thousands and thousands of PDFs and was unsatisfied with existing RAG tools, so I made my own. It can consume basically any type of content (PDF, EPUB, YouTube playlists, Anki databases, MP3, you name it) and does multi-step RAG: first retrieving candidates with embeddings, then filtering them with a smaller LLM, then answering by feeding each remaining document to a strong LLM, and finally combining those answers.
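To make the steps concrete, here's a rough sketch of that pipeline in Python. Every name here (Doc, embed_search, the prompts, the injected small_llm/strong_llm callables) is a hypothetical stand-in to illustrate the flow, not the actual DocToolsLLM code:

    # Minimal sketch of the multi-step RAG pipeline described above.
    # All names are hypothetical stand-ins, not the DocToolsLLM API.

    from dataclasses import dataclass

    @dataclass
    class Doc:
        text: str
        score: float = 0.0

    def embed_search(query: str, index, k: int = 50) -> list[Doc]:
        """Step 1: cheap embedding retrieval over the whole corpus."""
        return index.search(query, k)

    def llm_filter(query: str, docs: list[Doc], small_llm) -> list[Doc]:
        """Step 2: a small LLM discards chunks that merely look
        similar but don't actually help answer the question."""
        kept = []
        for d in docs:
            verdict = small_llm(
                f"Does this passage help answer '{query}'? "
                f"Answer yes or no.\n\n{d.text}")
            if verdict.strip().lower().startswith("yes"):
                kept.append(d)
        return kept

    def answer_each(query: str, docs: list[Doc], strong_llm) -> list[str]:
        """Step 3: the strong LLM answers once per surviving document."""
        return [strong_llm(f"Using only this source, answer: {query}\n\n{d.text}")
                for d in docs]

    def combine(query: str, answers: list[str], strong_llm) -> str:
        """Step 4: merge the per-document answers into one final answer."""
        joined = "\n---\n".join(answers)
        return strong_llm(
            f"Combine these partial answers to '{query}' "
            f"into one coherent answer:\n\n{joined}")

The point of the per-document answer step is that each source gets the strong LLM's full attention instead of being crammed into one giant context window.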

It supports virtually all LLMs and embeddings, including local LLMs and local embeddings. It scales surprisingly well, and I have tons of improvements to come when I have some free time or procrastinate. Don't hesitate to ask for features!

Here's the link: https://github.com/thiswillbeyourgithub/DocToolsLLM/



Nvidia's 'Chat with RTX' can do this as well https://www.nvidia.com/en-us/ai-on-rtx/chatrtx/

You do need a beefy GPU to run the local LLM, but I think it's a similar requirement for running any LLM on your machine.


I am deeply unsatisfied with how most RAG systems handle questions, chunking, embeddings, and storage, and even the ones used for summaries are usually rubbish. That's why I created my own tool. Check it out, I've updated it a lot! It supports Ollama too for private use.
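For the private/local setup, the generic pattern is to point the filter and answer steps at a local Ollama server so nothing leaves your machine. A minimal sketch using the plain ollama Python client (the model name is just an example; this is not DocToolsLLM's own integration):

    # Run prompts against a local Ollama server for fully private use.
    # Assumes `pip install ollama` and a running `ollama serve`.
    import ollama

    def local_llm(prompt: str, model: str = "llama3") -> str:
        # model name is an example; use whatever you've pulled locally
        resp = ollama.chat(model=model,
                           messages=[{"role": "user", "content": prompt}])
        return resp["message"]["content"]

    print(local_llm("Summarize why local inference matters for medical data."))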



