Are you planning on adding documentation / a mechanism for running a prompt using the code this installs?
As far as I can tell, at the moment it clones one of the various repos for you and downloads some model weights, but it doesn't yet help you compile and run the code.
Hello, I am planning to also add a runner option and a benchmarking option across the different implementations. This is just an MVP version while I try to keep track of all the llama2 implementations, since the ones listed in the original repo are a bit outdated.
Most of the implementations support the tinyllamas models. Regarding gguf/ggml and safetensors, each implementation has its own model importer, so there is no guarantee that all formats can be consumed by all implementations.
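For what it's worth, here is a minimal sketch of the kind of compatibility check a runner/benchmark option would probably need. None of these names come from the tool itself; the implementation names and format tags are purely illustrative assumptions.

```python
# Hypothetical sketch, not part of the tool's actual API.
# It only illustrates the point above: each repo ships its own model importer,
# so a weight format that works for one implementation may not load in another.

SUPPORTED_FORMATS = {
    # implementation -> weight formats its importer can consume (illustrative only)
    "llama2.c":  {"karpathy-bin"},            # e.g. the tinyllamas .bin checkpoints
    "llama.cpp": {"gguf", "ggml"},
    "candle":    {"safetensors", "karpathy-bin"},
}

def can_run(implementation: str, weight_format: str) -> bool:
    """Return True if the given implementation's importer accepts the format."""
    return weight_format in SUPPORTED_FORMATS.get(implementation, set())

if __name__ == "__main__":
    print(can_run("llama.cpp", "gguf"))        # True
    print(can_run("llama2.c", "safetensors"))  # False -> would need conversion first
```

A runner would consult something like this table before invoking an implementation, and either skip incompatible format/implementation pairs or convert the weights first.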