Hacker News | paraph1n's comments

Could someone point me towards a good resource for learning how to build a RAG app without LangChain or LlamaIndex? It's hard to find good information.


At a fundamental level, all you need to know is:

- Read in the user's input

- Use that to retrieve data that could be useful to an LLM (typically by doing a pretty basic vector search)

- Stuff that data into the prompt (literally insert it at the beginning of the prompt)

- Add a few lines to the prompt that state "hey, there's some data above. Use it if you can."
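The four steps above can be sketched in plain Python. This is a minimal illustration, not a production recipe: the `embed` function is a stand-in bag-of-words counter rather than a real embedding model, the `DOCUMENTS` corpus is made up, and the function names are hypothetical. A real app would swap in an embedding model and a vector store, then send the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real app would load and chunk its own documents.
DOCUMENTS = [
    "The office is closed on public holidays.",
    "Refunds are processed within five business days.",
    "Vector search ranks documents by similarity between embeddings.",
]


def embed(text: str) -> Counter:
    """Stand-in embedding: word counts (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Step 2: a basic vector search returning the k most similar documents."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Steps 3 and 4: put the retrieved data first, then tell the model to use it."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        f"Context:\n{context}\n\n"
        "There is some data above. Use it if it helps answer the question.\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # Step 1: read the user's input (hard-coded here for the sketch).
    user_input = "How does vector search rank documents?"
    print(build_prompt(user_input, DOCUMENTS))
```

The string returned by `build_prompt` is what you would pass to whichever LLM client you use; nothing in the retrieval step depends on a framework.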


You can start by reading up on how embeddings work, then check out the specific RAG techniques that people have discovered. Not much else is needed, really.


Here's a blog post that I just pushed that doesn't use them at all - https://blog.dagworks.io/p/building-a-conversational-graphdb (we have more on our blog - search for RAG).

[Disclaimer: I created Hamilton & Burr, both whitebox frameworks.] See https://www.reddit.com/r/LocalLLaMA/comments/1d4p1t6/comment... for a comment about Burr.


My strategy has been to implement in / follow along with LlamaIndex, dig into the details, and then reimplement that in a less abstracted, easily understandable codebase / workflow.

I was driven to do so because overriding a prompt was not as easy as I'd like. You can see how they construct the various prompts for the agents; it's pretty basic text/template stuff.



Data Centric on YouTube has some great videos: https://youtube.com/@data-centric?si=EOdFjXQ4uv02J774



OpenAI Cookbook! Instructor is a decent library that can help with the annoying parts without abstracting the whole API call. See its docs for RAG examples.


> The outcome provably does not exist until you measure it.

This is not true. It only provably does not exist in the form of local hidden variables.


It doesn't change much in practice. If the event is influenced by a state outside of its past light cone, you (that is, an observer inside the universe) cannot predict the outcome, even theoretically.


I believe "in practice, theoretically" simplifies to "theoretically".

In actual practice there might as well be hidden local variables here. You wouldn't be able to tell the difference, even though you could in theory.


I think it was a top level post, but my confidence is low.


What do I search for to find music/audio like this? It sounds so beautiful.



It reminded me a bit of Plastikman, for example https://www.youtube.com/watch?v=oQduttGOQSE

Probably because it uses the same 303's and 909's :)


How does it compare to zapatos?

https://jawj.github.io/zapatos/


We moved away from zapatos because the generated types are good only when selecting from a single table. The moment we start selecting some subset of columns from a join of multiple tables, it is up to the developer to provide the right combination of pick and intersection of generated types, and type safety takes a hit.

The solution we use right now is ts-sql-query [1], which supports automatic type safety for complex joins, CTEs, subselects, etc. I evaluated Kysely as well but found the SQL feature coverage of ts-sql-query better at the time.

I maintain a code generator [2] for this project that can generate the table mappers from the database schema, similar to how zapatos does.

We don't have as good support for lateral joins and deriving json from database though, which zapatos does really well.

[1] https://ts-sql-query.readthedocs.io/

[2] https://github.com/lorefnon/ts-sql-codegen


> which supports automatic type-safety for complex joins, CTEs, subselects etc. I evaluated Kysely as well but found the sql feature set coverage of ts-sql-query better at the time.

Kysely also provides "automatic type-safety for complex joins, CTEs, subselects etc.".

Gotta love how toxic some open-source maintainers are, bashing other libraries while self-promoting.


How does this compare with pgTyped[1]?

[1] https://github.com/adelsz/pgtyped


I like pgtyped - when the queries are mostly static it is a great solution.

Solutions like ts-sql-query are better when you need to dynamically generate complex SQL. With ts-sql-query it is very easy to create SQL select statements where multiple individual where clauses, or even joins, are conditional on the incoming filters.
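To make "conditional where clauses and joins" concrete, here is a language-agnostic sketch in plain Python. It does not use ts-sql-query or its API; the `users` and `teams` tables, the filter names, and `build_user_query` are all hypothetical. It only shows the shape of the problem: the query text itself changes with the incoming filters, which a fixed, statically written query cannot express.

```python
import sqlite3


def build_user_query(filters: dict) -> tuple[str, list]:
    """Assemble a SELECT whose JOIN and WHERE clauses depend on which filters are set."""
    sql = ["SELECT u.id, u.name FROM users u"]
    where, params = [], []

    if filters.get("team") is not None:
        # The join is added only when the caller filters by team.
        sql.append("JOIN teams t ON t.id = u.team_id")
        where.append("t.name = ?")
        params.append(filters["team"])

    if filters.get("min_age") is not None:
        where.append("u.age >= ?")
        params.append(filters["min_age"])

    if where:
        sql.append("WHERE " + " AND ".join(where))
    sql.append("ORDER BY u.id")
    # Values travel as bound parameters, never as interpolated strings.
    return " ".join(sql), params


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, team_id INTEGER);
        INSERT INTO teams VALUES (1, 'core'), (2, 'docs');
        INSERT INTO users VALUES (1, 'ann', 30, 1), (2, 'bob', 20, 2), (3, 'cy', 40, 1);
        """
    )
    query, args = build_user_query({"team": "core", "min_age": 35})
    print(query, args, conn.execute(query, args).fetchall())
```

A hand-rolled string builder like this gives no compile-time guarantee about the columns or result shape, which is the gap a typed query builder is meant to close.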

You can choose to use stored procedures etc. for the more complex cases while keeping pgtyped for the 80% of use cases that are less dynamic. We decided not to go that route, to keep most of the application in TypeScript, which we are more comfortable with.


I'd suggest you try them both, and pick what you like better or what feels safer to bet on for your project.

I know some of our users use both, zapatos for codegen and kysely for querying.


GrapheneOS for phone


That's not quite true. The function call in question is

> ImageInstanceQ[x,"caprine animal",RecognitionThreshold->i/100]

I think it's less misleading to say it has a generic image recognition function that supports goats, among many other recognition targets.


Now this is an application of AI that I can get behind!


Highly unlikely? Tim Cook literally said in the quote, "You can never say never."


What about people who game on their laptop (at home) but don't need to game on the go? In that case an eGPU also sounds like a reasonable choice. They can just leave the eGPU at home and still be productive away from their desk.

