It’s sustainable. llama.cpp is orders of magnitude more efficient than the original Llama implementation because it’s in Rust.

OpenAI hasn’t even touched low-level languages yet.




C++, not Rust. And most ML libraries called from Python or whatever are themselves C++ underneath, but of course there's always room for improvement, especially if you reimplement the entire codebase in C++ and specialize the inference program for your architecture. Specialization often opens the door to more efficient code: you are not running a generic neural net, you know the architecture beforehand, so you can design its memory layout etc. for efficiency.
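To make the specialization point concrete, here is a minimal sketch (not llama.cpp's actual code, and the 4096 dimension below is an assumption matching Llama-7B's hidden size): fixing the shapes at compile time lets the compiler unroll and vectorize the kernel in a way a generic runtime-shaped tensor implementation cannot.

    // Sketch only: a matrix-vector product whose dimensions are
    // template parameters. The trip counts become compile-time
    // constants, so the optimizer can unroll and vectorize; a
    // generic kernel only learns the shapes at runtime.
    #include <cstddef>

    template <std::size_t Rows, std::size_t Cols>
    void matvec(const float* w, const float* x, float* out) {
        for (std::size_t r = 0; r < Rows; ++r) {
            float acc = 0.0f;
            for (std::size_t c = 0; c < Cols; ++c)
                acc += w[r * Cols + c] * x[c];  // row-major weight layout
            out[r] = acc;
        }
    }

    // 4096 is an assumed hidden size (as in Llama-7B); baking it in
    // via matvec<4096, 4096> compiles a kernel specific to this model.

The same idea extends to memory layout: if you know every tensor shape up front, you can lay all the weights out in one contiguous, mmap-friendly block instead of allocating generic tensors one by one.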


Do you seriously not know what the .cpp in llama.cpp stands for?



