A lot of llama.cpp is C with SIMD code. I think using C++ elsewhere in the project just made sense in its infancy, when it was CPU-only.
Rust is really interesting for stuff besides running the actual LLM, though. Python can be a huge performance and debugging pain once your frontend and the code around it get big.