Hello HN, I'm Owen from SciPhi (https://www.sciphi.ai/), a startup working on simplifying Retrieval-Augmented Generation (RAG). Today we’re excited to share R2R (https://github.com/SciPhi-AI/R2R), an open-source framework that makes it simpler to develop and deploy production-grade RAG systems.
Just a quick reminder: RAG helps Large Language Models (LLMs) use current information and specific knowledge. For example, it allows a programming assistant to use your latest documents to answer questions. The idea is to gather all the relevant information (“retrieval”) and present it to the LLM with a question (“augmentation”). This way, the LLM can provide answers (“generation”) as though it had been trained directly on your data.
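To make the retrieval and augmentation steps concrete, here is a minimal sketch in plain Python. It is not R2R code: the function names (`retrieve`, `augment`), the sample documents, and the prompt wording are all made up for illustration, and a simple word-count cosine similarity stands in for a real embedding model. The resulting prompt is what would be sent to an LLM for the generation step.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Retrieval: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def augment(query: str, context: list[str]) -> str:
    """Augmentation: put the retrieved text in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical documents, invented for this example.
docs = [
    "The deploy script lives in tools/deploy.sh and needs Python 3.11.",
    "Lunch is served at noon on Fridays.",
    "Run the deploy script with the --dry-run flag to preview changes.",
]
question = "How do I run the deploy script?"
prompt = augment(question, retrieve(question, docs))
print(prompt)  # this prompt would then go to the LLM ("generation")
```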
The R2R framework is a powerful tool for addressing key challenges in deploying RAG systems, avoiding the complex abstractions common in other projects. Through conversations with numerous developers, we discovered that many were independently developing similar solutions. R2R distinguishes itself by adopting a straightforward approach to streamline the setup, monitoring, and upgrading of RAG systems. Specifically, it focuses on reducing unnecessary complexity and enhancing the visibility and tracking of system performance.
R2R has three key parts. The Ingestion Pipeline transforms different data types (such as JSON, TXT, PDF, and HTML) into 'Documents' ready for embedding. The Embedding Pipeline takes text and turns it into vector embeddings through a series of steps (extracting text, transforming it, chunking, and embedding). Finally, the RAG Pipeline follows the steps of the embedding pipeline but adds an LLM provider to create text completions.
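The three stages can be sketched roughly as follows. This is an illustrative outline only and does not use R2R's actual classes or API: `Document`, `ingest`, `embedding_pipeline`, `rag_pipeline`, and `toy_embed` are hypothetical names, the embedding is a hash-based stand-in (so the similarity ranking carries no real meaning), and the LLM provider is a stub callable.

```python
import hashlib
import json
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical normalized unit produced by ingestion."""
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def ingest(raw: str, kind: str, doc_id: str) -> Document:
    """Ingestion: turn one raw input into a Document (only json and txt here)."""
    if kind == "json":
        # Assumes a flat JSON object; its values are joined into one text.
        text = " ".join(str(v) for v in json.loads(raw).values())
    elif kind == "txt":
        text = raw.strip()
    else:
        raise ValueError(f"unsupported type: {kind}")
    return Document(doc_id, text, {"kind": kind})


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for an embedding model: maps a SHA-256 digest to dim floats."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def embedding_pipeline(doc: Document, chunk_size: int = 40) -> list[tuple[str, list[float]]]:
    """Embedding: chunk the document text, then embed each chunk."""
    chunks = [doc.text[i:i + chunk_size] for i in range(0, len(doc.text), chunk_size)]
    return [(c, toy_embed(c)) for c in chunks]


def rag_pipeline(query: str, index: list[tuple[str, list[float]]],
                 llm: Callable[[str], str]) -> str:
    """RAG: embed the query, pick the closest chunk, and call the LLM provider."""
    q = toy_embed(query)
    best_chunk, _ = max(index, key=lambda item: sum(a * b for a, b in zip(q, item[1])))
    return llm(f"Context: {best_chunk}\nQuestion: {query}")


doc = ingest('{"title": "R2R", "body": "open-source RAG framework"}', "json", "demo-1")
index = embedding_pipeline(doc, chunk_size=16)
answer = rag_pipeline("What is R2R?", index, llm=lambda p: f"[stub completion] {p}")
print(answer)
```

In a real deployment the stub `llm` callable and `toy_embed` would be replaced by actual providers; the point here is only how the stages feed into each other.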
R2R is currently in use at several companies building applications ranging from B2B lead generation to educational tools for consumers.
Our GitHub repo (https://github.com/SciPhi-AI/R2R) includes basic examples for application deployment and standalone use, demonstrating the framework's adaptability in a simple way.
We’d love for you to give R2R a try, and welcome your feedback and comments as we refine and develop it further!
The biggest challenges I've faced around RAG are things like:
- Only works on text-based modalities (how can I use this with all types of source documents, including images?)
- Chunking "well" for the type of document (by paragraph, CSVs including the header on every chunk, tables in PDFs, diagrams, etc.). The rudimentary chunk-by-character-with-overlap approach is demonstrably not very good at retrieval.
- The R in RAG is really just "how can you do the best possible search for the given query". The approach here is so simple that it definitely does not give the best possible search results. It's missing so many known techniques right now like:
- Also other search approaches, like fuzzy search and lexical approaches, and ranking them based on criteria like "user query is one word, so use fuzzy search instead of semantic search". Things like that.
So far this platform seems to just lock you into a really simple embedding pipeline that only supports the simplest chunk-based retrieval. I wouldn't use this unless there was some promise of it actually solving some challenges in RAG.
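For reference, two of the ideas from the list above are easy to sketch in Python: the rudimentary chunk-by-character-with-overlap strategy, and routing one-word queries to fuzzy matching. This is only an illustration. `chunk_by_chars`, `pick_search_mode`, and `fuzzy_search` are hypothetical names, not part of R2R, and the one-word rule is just the heuristic mentioned in the list, not a tuned policy.

```python
import difflib


def chunk_by_chars(text: str, size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


def pick_search_mode(query: str) -> str:
    """Heuristic from the list above: one-word queries go to fuzzy search."""
    return "fuzzy" if len(query.split()) == 1 else "semantic"


def fuzzy_search(query: str, vocabulary: list[str], k: int = 3) -> list[str]:
    """Lexical fallback: closest strings by difflib similarity ratio."""
    return difflib.get_close_matches(query, vocabulary, n=k, cutoff=0.6)


chunks = chunk_by_chars("abcdefghij", size=4, overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']

terms = ["retrieval", "generation", "augmentation"]  # made-up vocabulary
query = "retreival"  # misspelled on purpose
if pick_search_mode(query) == "fuzzy":
    print(fuzzy_search(query, terms))
```

Note how the chunker cuts wherever the character count falls, with no regard for words, paragraphs, or table rows, which is the weakness the list points at.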