Hacker News

The RAG CLI from LlamaIndex lets you do it 100% locally when used with Ollama or llama.cpp instead of OpenAI.

https://docs.llamaindex.ai/en/stable/getting_started/starter...




and at some point (https://github.com/ggerganov/llama.cpp/issues/7444) you will be able to use Phi-3-vision https://huggingface.co/microsoft/Phi-3-vision-128k-instruct

but for now you will have to use Python.

You can try it here https://ai.azure.com/explore/models/Phi-3-vision-128k-instru... to get an idea of its OCR + QA abilities


Does the llamaindex PDF indexer correctly deal with multi-column PDFs? Most I've seen don't, and you get very odd results because of this.
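For intuition on why most indexers get this wrong: a naive extractor sorts text blocks top-to-bottom across the whole page, which interleaves the two columns. A toy sketch in plain Python (the coordinates and the `column_gap` heuristic are made up for illustration, not from any particular library):

```python
# Blocks as (x0, y0, text), the way a PDF extractor might report them.
blocks = [
    (50, 100, "Left col, line 1"),
    (300, 100, "Right col, line 1"),
    (50, 120, "Left col, line 2"),
    (300, 120, "Right col, line 2"),
]

# Naive: sort purely by vertical position -> columns interleave.
naive = [t for _, _, t in sorted(blocks, key=lambda b: (b[1], b[0]))]

# Column-aware: bucket blocks by horizontal position first,
# then read each column top to bottom.
def by_columns(blocks, column_gap=150):
    columns = {}
    for x, y, text in blocks:
        columns.setdefault(x // column_gap, []).append((y, text))
    return [t for _, col in sorted(columns.items())
            for _, t in sorted(col)]

print(naive)              # left/right lines interleaved
print(by_columns(blocks)) # each column read in full
```

Real layouts (varying column widths, figures, footnotes) are much messier than this, which is why so many extractors get it wrong.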


I've made quite good conversions from PDF to Markdown with https://github.com/VikParuchuri/marker . It's slow but worth a shot. Markdown should be easily parseable by a RAG pipeline.

I'm trying to get a similar system set up on my computer.
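Markdown's heading structure is what makes it RAG-friendly: you can split on headings to get self-contained chunks. A minimal standard-library sketch (the splitting heuristic is my own, not part of Marker):

```python
import re

def chunk_markdown(md: str) -> list[str]:
    """Split a Markdown document into chunks, one per heading section."""
    chunks, current = [], []
    for line in md.splitlines():
        # Start a new chunk whenever an ATX heading begins.
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Intro\nSome text.\n\n## Details\nMore text.\n"
for chunk in chunk_markdown(doc):
    print(repr(chunk))
# -> '# Intro\nSome text.'
# -> '## Details\nMore text.'
```

Each chunk then goes into the vector store as one document, with the heading kept as context.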


This looks worth exploring, so thanks. The author has done a bunch of work beyond what PyMuPDF does on multicolumn layouts.


Locally you can choose pypdf or MuPDF, which are good but not perfect. If you can send your data online, LlamaParse is quite good.


Pulling the text out of the PDFs correctly is its own independent problem.



https://milvus.io/docs/integrate_with_llamaindex.md

Pretty easy to run locally and lightweight with Milvus Lite and LlamaIndex.


LlamaIndex has a horrible API, very poor docs, and is constantly changing. I do not recommend it.


Any alternative?


Vanilla python
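To be fair, the core retrieval step really is small enough to write without a framework. A toy retriever in pure standard-library Python, where crude bag-of-words counts stand in for real embeddings:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding'; real systems use a vector model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "ollama runs large language models locally",
    "milvus is a vector database",
    "marker converts pdf files to markdown",
]
print(retrieve("convert a pdf to markdown", docs))
# -> ['marker converts pdf files to markdown']
```

Swap `embed` for an actual embedding model and feed the top-k chunks to the LLM as context, and you have the skeleton of a RAG pipeline.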


So your solution to “I don’t like flying [specific airline]” would be “how about a big pile of aluminum and some jet fuel”?


LOL `papichulo`? What a character!


Haha, the first word that came to my mind.



