maciejgryka's comments

I’d encourage everyone who finds this appealing to check out how Ecto works in Elixir. It’s all functional & immutable goodness, and pipelines are built into the language and idiomatic.

Definitely looks weird at first glance (and Ecto is kinda weird even if you’re familiar with Elixir), but it’s such a joy to use once you grok it.


Highest-end fidget spinner I’ve ever seen. Instantly appealing to my inner 6-year-old.


Yes, compute is absolutely the limiting factor today. Not only because the space of hyperparameters is huge and more compute would make it easier (or possible at all) to explore, but also, weirdly, because inference is becoming increasingly important for training, which means even more compute! A lot of work these days goes into getting better data, and it turns out that using an existing large model to create/curate data for you works really well.
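To make the "use a big model to curate data" idea concrete, here's a toy sketch. The `judge_score` function is a stand-in for calling a real large model as a quality judge; the heuristic and threshold below are invented purely for illustration.

```python
# Toy sketch of model-based data curation: score candidate training
# examples with a "judge" and keep only the best ones. `judge_score`
# stands in for a real LLM call; here it just rewards longer,
# punctuation-terminated sentences.

def judge_score(example: str) -> float:
    # Hypothetical quality heuristic standing in for an LLM judge.
    score = min(len(example.split()), 20) / 20
    if example.rstrip().endswith((".", "!", "?")):
        score += 0.5
    return score

def curate(examples: list[str], threshold: float = 0.6) -> list[str]:
    # Keep only examples the "judge" rates at or above the threshold.
    return [e for e in examples if judge_score(e) >= threshold]

candidates = [
    "The mitochondria is the powerhouse of the cell.",
    "asdf qwer",
    "Retrieval quality matters more than model size for many tasks.",
]
print(curate(candidates))  # drops the junk example
```

The expensive part in practice is that every scoring call is an inference pass through a large model, which is where the extra compute demand comes from.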


> “Retrieval-Augmented Generation” is nothing more than a fancy way of saying “including helpful information in your LLM prompt.”


For sure, it's only worth doing if you actually have so much relevant data that it doesn't fit in the context! This is definitely the case for us for this problem, but it's not universal.
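A minimal sketch of what "including helpful information in your prompt" with retrieval looks like. Toy bag-of-words vectors stand in for a real embedding model, and the documents are invented for illustration:

```python
import math

# Minimal RAG sketch: embed documents, retrieve the most similar one
# to the query, and paste it into the prompt as context.

DOCS = [
    "Ecto is a database wrapper and query language for Elixir.",
    "BigQuery is a serverless data warehouse on Google Cloud.",
    "CLIP trains an image encoder and a text encoder jointly.",
]

def embed(text: str, vocab: list[str]) -> list[float]:
    # Bag-of-words counts as a stand-in for a learned embedding.
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    vocab = sorted({w for t in docs + [query] for w in t.lower().split()})
    q = embed(query, vocab)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d, vocab)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, DOCS))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("what is a serverless data warehouse?"))
```

When the whole corpus fits in the context window you can skip the retrieval step entirely and just paste everything in, which is the point being made above.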


To me the RAG hype is just the sudden rediscovery of information retrieval by money hounds who didn't care about AI/ML for decades and are now in panic mode due to FOMO.


There's hype and FOMO for sure, and you're right that there's lots to learn from information retrieval work. But why be dismissive of the whole thing? People learning from past research and applying it in new contexts seems like a good thing, and there are people (like https://x.com/bclavie and https://x.com/jobergum) who are legit experts in information retrieval and are showing how to apply it to RAG properly.


There are knowledgeable people out there with excellent use cases for RAG for finding relevant data. They come from years of using the previous state-of-the-art techniques in information retrieval. They have all my support and I'm looking forward to reading more about it.

But the majority of talk about RAG is coming from hype-driven individuals who produce low-merit, uninformed opinions.


This is one of the things we learned recently about building production workflows with LLMs. Happy to answer any questions/feedback here <3


For me it's nice to see the reuse of existing infra: BigQuery. Personally I was rooting for Postgres, but your logic for why not makes sense! Great post.


I have no actual info on this, but I always assumed they'd compute multimodal embeddings of the screenshots and then retrieve semantically relevant ones by text? And yeah, they'd have to do it using on-device models, which doesn't seem out of reach.
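A sketch of what that guessed pipeline could look like: a CLIP-style model maps screenshots and text queries into one shared vector space, screenshots are embedded once at capture time, and a text query ranks them by similarity. All filenames, vectors, and the `embed_text` lookup below are made up; real embeddings would come from learned on-device encoders.

```python
import math

# Text-to-screenshot retrieval in a shared multimodal embedding space.
# The index maps each screenshot to a precomputed (here: fabricated)
# embedding vector; a text query is embedded into the same space and
# the nearest screenshot wins.

screenshot_index = {
    "shot_001.png": [0.9, 0.1, 0.0],  # hypothetical: a boarding pass
    "shot_002.png": [0.1, 0.8, 0.3],  # hypothetical: a recipe
    "shot_003.png": [0.0, 0.2, 0.9],  # hypothetical: a map
}

def embed_text(query: str) -> list[float]:
    # Stand-in for the text encoder; a real one is a learned model.
    fake = {"boarding pass": [1.0, 0.0, 0.0], "recipe": [0.0, 1.0, 0.2]}
    return fake.get(query, [0.0, 0.0, 1.0])

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query: str) -> str:
    q = embed_text(query)
    return max(screenshot_index, key=lambda k: cosine(q, screenshot_index[k]))

print(search("boarding pass"))
```

The nice property of this design is that the expensive image encoding happens once per screenshot, so query-time work is just one text embedding plus a nearest-neighbor scan.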


I recorded myself trying to read through and understand the high-level of this if anyone's interested in following along: https://maciej.gryka.net/papers-in-public/#scaling-monoseman...


Isn't "two models slapped together" basically how all of these things work, starting with CLIP? Not sure about GPT-4o, obviously; I don't think they've released any underlying architecture details.


Your understanding is correct; even GPT-4o will have an encoder model.
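For intuition, here's a toy version of that two-encoder setup. The "encoders" are fixed made-up vectors rather than real networks, but the symmetric contrastive objective (cross-entropy toward the diagonal of the image-text similarity matrix) is the CLIP-style training signal that glues the two models together.

```python
import math

# CLIP-style "two models slapped together": separate image and text
# encoders produce vectors in a shared space, and training pushes
# matching (image, caption) pairs onto the diagonal of the
# pairwise-similarity matrix.

image_embs = [[1.0, 0.0], [0.0, 1.0]]  # fake "encoded" images
text_embs  = [[0.9, 0.1], [0.2, 0.8]]  # fake "encoded" captions

def similarity_matrix(imgs, txts):
    # Dot-product similarity between every image and every caption.
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def contrastive_loss(sim):
    # Symmetric InfoNCE: each image should pick its own caption and
    # vice versa (cross-entropy with the diagonal as the target).
    def ce_rows(m):
        total = 0.0
        for i, row in enumerate(m):
            logsumexp = math.log(sum(math.exp(s) for s in row))
            total += logsumexp - row[i]
        return total / len(m)
    transposed = [list(c) for c in zip(*sim)]
    return 0.5 * (ce_rows(sim) + ce_rows(transposed))

sim = similarity_matrix(image_embs, text_embs)
print(contrastive_loss(sim))
```

Swapping the captions (so the true pairs are off-diagonal) makes the loss go up, which is the signal that aligns the two encoders during training.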


What even is "a model"? I'm not sure there's a technical definition that corresponds to how the tech public uses the term:

- single interconnected neural network (LLM attention layers break this, autoencoders complicate this)

- single training pass (LLMs have multiple passes, GANs have a single but produce multiple models)


> - single training pass (LLMs have multiple passes, GANs have a single but produce multiple models)

LLMs have multiple passes? wdym?


I haven't tried this yet; excited to see how it can do segmentation by outputting a series of coordinates! That's something I'd just assumed transformers would generally be bad at.


How it does this is really cool: it's got a VAE decoder. Reminds me a lot of how SAM works.

