Wow, thanks! I really like Ollama, so I'm glad the models listed at the top of the Readme aren't all you could use. Do you know where the data in that Gist came from? Is there a registry somewhere?
I find the whole site being covered in images of skinny waifu girls... offputting. The guide itself seems fine, if high-level. [This](https://chat.lmsys.org/?leaderboard) is a really nice link I hadn't seen before.
I only got a couple paragraphs in before thinking, "This was written by a fluff AI, and probably not even Llama." I think Llama has better English proficiency.
Same here... the images are beautiful but seem completely unrelated to the topic (other than being AI-generated - but they could've generated images that have some sort of relation to the topic instead!). It kind of shows what the author has been using the AI for, I suppose :P.
Oh I see... and a "model" is always young, skinny, white, and female, I guess (I cannot blame the AI, though; I suppose that is what you get if you use the internet as a whole as your training data).
EDIT: I had to look up "waifu", and it seems the author probably prompted for "waifu model"(?)... according to Wikipedia, for those like me who are not into that sort of thing:
"A waifu is an illustrated female character from an anime or other non-live-action media to whom an individual becomes sexually attracted."
I tried multiple flavors of Llama models; they are all quite dumb. Even the 70B-parameter one. It knows about more things (which the smaller models just hallucinate when asked), but it still cannot do even slightly more complex tasks.
I'm also not sure about the current testing methodologies, e.g. the "passed the SAT" hype. Given that the training set already contains much of the relevant information, we should probably compare the AI's results against humans who have unlimited time and access to the required material.
Can you recommend any other systematized guides like this about LLMs and ML in general? Even though this one was quite light on detail, I like the structured and concise form, and I'd be happy to learn about similar blog posts.
I really enjoyed Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c), which runs through creating and running a miniature Llama 2-architecture model from scratch.
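If you want to try it yourself, the flow from the repo's README is roughly this (commands paraphrased from the README; the checkpoint URL and filenames may have changed since):

```shell
git clone https://github.com/karpathy/llama2.c && cd llama2.c

# Grab a tiny 15M-parameter Llama 2 checkpoint trained on TinyStories
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin

# Compile the single-file C inference program and sample from the model
make run
./run stories15M.bin
```

The whole inference loop lives in one C file, which is what makes it such a nice teaching project.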
As a Mac user, I think Ollama is amazing. Thank you! :). Is there any chance that functionality could be added for fine-tuning (e.g. document/text file uploads)?
What's a good, friendly place for a seasoned Android developer to start getting a peek into the world of AI/ML as we're seeing it today (or specifically over the last few months)? I have a mix of FOMO and curiosity (and a little worry about career/future) about all this, and just want to get a taste of things.
~20 GB of VRAM for the 7B model and 48 GB for the 13B model.
It depends on the context size as well. I'd recommend renting a 4090 from a cloud provider like RunPod or Vast.ai to get started, using a PEFT tutorial.
The 4090 only has 24 GB and will only be able to fine-tune (and merge, which is more memory-intensive) the 7B model. The RTX 6000 with 48 GB is able to fine-tune the 13B model. The 70B model presumably needs multiple GPUs, e.g. 4x RTX 6000.
For people starting out, you can also use a free GPU from Google Colab to fine-tune a 7B model. Fine-tuning 70B gets much more expensive, and I would suggest trying smaller models first with a high-quality dataset.
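As a back-of-the-envelope sanity check on those numbers: weight memory alone for a 7B model in fp16 is ~14 GB, and fine-tuning adds adapter weights, optimizer state, and activations on top. The 2 bytes/param and the flat 1.4x overhead factor below are my assumptions, not measured values, and real overhead grows with context length (and spikes when merging), so actual figures like the 48 GB quoted above can be higher:

```python
def vram_estimate_gb(n_params, bytes_per_param=2, overhead=1.4):
    """Rough VRAM estimate for LoRA-style fine-tuning:
    fp16 weight memory times a flat overhead factor for
    adapters, optimizer state, and activations (assumed)."""
    return n_params * bytes_per_param * overhead / 1e9

for name, n in [("7B", 7e9), ("13B", 13e9)]:
    print(f"{name}: ~{vram_estimate_gb(n):.0f} GB")
```

By this crude estimate the 7B model lands around 20 GB (why a 24 GB 4090 works) and the 13B model is already well past 24 GB, which matches the advice above to step up to a 48 GB card.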
A maintainer of the project has been collecting a full list here (with different quantization levels), most of which are Llama 2-based: https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f...
Since the release of Llama 2, the number of models based on it has been growing significantly. Some popular ones:
- codeup (A code generation model - DeepSE)
- llama2-uncensored (George Sung)
- nous-hermes-llama2 (Nous Research)
- wizardlm-uncensored (WizardLM)
- stablebeluga (Stability AI)
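For anyone who wants to try one of these, the Ollama CLI pulls and runs a model by name. The exact tags below are assumptions based on the names in that list; check the registry or `ollama list` for what's actually available:

```shell
# Pull and chat with one of the community Llama 2 fine-tunes
ollama run nous-hermes-llama2

# Larger parameter counts are usually separate tags, e.g.:
ollama run llama2:13b
```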
The article also recommends oobabooga's text-generation-webui, which includes a full web dashboard.