You can view RAG as a bigger word2vec, the canonical example being "king - man + woman = queen". Words, or now whole chunks of text, have a geometric distribution in the embedding space: they form clusters, and the distances and directions between them encode semantic relationships.
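To make that analogy concrete, here is a minimal sketch using gensim's pretrained GloVe vectors (assumes gensim is installed and can download the model on first use); the nearest neighbor of king - man + woman comes out as queen:

    import gensim.downloader as api

    # Small pretrained GloVe model, downloaded on first use
    vectors = api.load("glove-wiki-gigaword-50")

    # king - man + woman ~ queen: vector arithmetic over word embeddings
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    # [('queen', 0.85...)]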
What is happening is that text is embedded into a different space, and the result is an array of floats (a point in the embedding space). When we do retrieval, we embed the query the same way and then find other points close to that query. The reason for a vector DB is (1) to optimize for this use case, just as we have specialized data stores / indexes for other workloads (Redis, Elastic, Dolt, RDBMSs), and (2) it is often memory-based for faster retrieval. PgVector will be interesting to watch. I personally use Qdrant.
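A rough sketch of that flow with Qdrant in local in-memory mode; the 4-dimensional toy vectors stand in for what an embedding model would actually produce, and the collection name is made up:

    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, VectorParams, PointStruct

    client = QdrantClient(":memory:")  # in-process mode, no server needed

    client.create_collection(
        collection_name="chunks",
        vectors_config=VectorParams(size=4, distance=Distance.COSINE),
    )

    # In practice these vectors come from an embedding model; toy values here
    client.upsert(
        collection_name="chunks",
        points=[
            PointStruct(id=1, vector=[0.9, 0.1, 0.0, 0.2], payload={"text": "chunk about kings"}),
            PointStruct(id=2, vector=[0.1, 0.9, 0.3, 0.0], payload={"text": "chunk about databases"}),
        ],
    )

    # Embed the query the same way, then find the closest points
    hits = client.search(collection_name="chunks", query_vector=[0.85, 0.15, 0.05, 0.1], limit=1)
    print(hits[0].payload["text"], hits[0].score)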
Full-text search will never be able to do some of the things that are possible in the embedding space, such as matching on meaning when the query shares no keywords with the text. The most capable systems will use both techniques.
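One common way to combine the two is reciprocal rank fusion over the ranked lists from full-text and vector search; this is a generic sketch, not tied to any particular engine, and the doc ids are invented for illustration:

    def reciprocal_rank_fusion(rankings, k=60):
        """Merge several ranked lists of doc ids into one; k=60 is the usual constant."""
        scores = {}
        for ranking in rankings:
            for rank, doc_id in enumerate(ranking):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)

    # e.g. fuse a BM25 (full-text) ranking with a vector-search ranking
    fused = reciprocal_rank_fusion([["doc3", "doc1", "doc7"], ["doc1", "doc9", "doc3"]])
    print(fused)  # doc1 and doc3 rise to the top because both systems rank them highly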