Wow, thanks! I really like Ollama, so I'm glad the models listed at the top of the Readme aren't all you could use. Do you know where the data in that Gist came from? Is there a registry somewhere?
I find the whole site being covered in images of skinny waifu girls... offputting. The guide itself seems fine, if high-level. [This](https://chat.lmsys.org/?leaderboard) is a really nice link I hadn't seen before.
I only got a couple paragraphs in before thinking, "This was written by a fluff AI, and probably not even Llama." I think Llama has better English proficiency.
Same here... the images are beautiful but seem completely unrelated to the topic (other than being AI-generated - but they could've generated images that have some sort of relation to the topic instead!). It kind of shows what the author has been using the AI for, I suppose :P.
Oh I see... and a "model" is always young, skinny, white, and female, I guess (I cannot blame the AI, though; I suppose that is what you get if you use the internet as a whole as your training data).
EDIT: I had to look up "waifu", and it seems the author probably prompted for "waifu model"(?)... according to Wikipedia, for those like me who are not into that sort of thing:
"A waifu is an illustrated female character from an anime or other non-live-action media to whom an individual becomes sexually attracted."
I tried multiple flavors of Llama models; they are all quite dumb. Even the 70B-parameter one. It knows about more things (which the smaller models just hallucinate when asked), but it still cannot do even slightly more complex tasks.
I'm also not sure about the current testing methodologies, e.g. the "passed the SAT" hype. Given that the training set already contains much of the relevant information, we should probably compare the AI's results against humans who have unlimited time and access to the required material.
Can you recommend any other systematized guides like this about LLMs and ML in general? Even though this one was quite light on detail, I like the structured and concise form, and I'd be happy to learn about similar blog posts.
I really enjoyed Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c), which runs through creating and running a miniature Llama 2-architecture model from scratch.
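If you want to try it yourself, the flow from the repo's README is roughly this (commands paraphrased from the README; the checkpoint URL and filenames may have changed since):

```shell
git clone https://github.com/karpathy/llama2.c && cd llama2.c

# Grab a tiny 15M-parameter Llama 2 checkpoint trained on TinyStories
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin

# Compile the single-file C inference program and sample from the model
make run
./run stories15M.bin
```

The whole inference loop lives in one C file, which is what makes it such a nice teaching project.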
As a Mac user, I think Ollama is amazing. Thank you! :). Is there any chance that functionality could be added for fine-tuning (e.g. document/text file uploads)?
What's a good, friendly place for a seasoned Android developer to start getting a peek into the world of AI/ML as we're seeing it today (or specifically over the last few months)? I have a mix of FOMO and curiosity (and a little worry about career/future) about all this, and just want to get a taste of things.
~20 GB of VRAM for the 7B model and 48 GB for the 13B model.
It depends on the context size as well. I'd recommend renting a 4090 from a cloud provider like RunPod or Vast.ai to get started, using a PEFT tutorial.
The 4090 only has 24 GB and will only be able to fine-tune (and merge, which is more memory-intensive) the 7B model. The RTX 6000 with 48 GB is able to fine-tune the 13B model. The 70B model presumably needs multiple GPUs, e.g. 4x RTX 6000.
For people starting out, you can also use a free GPU from Google Colab to fine-tune a 7B model. Fine-tuning 70B gets much more expensive, and I would suggest trying smaller models first with a high-quality dataset.
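As a back-of-the-envelope sanity check on those numbers: weight memory alone for a 7B model in fp16 is ~14 GB, and fine-tuning adds adapter weights, optimizer state, and activations on top. The 2 bytes/param and the flat 1.4x overhead factor below are my assumptions, not measured values, and real overhead grows with context length (and spikes when merging), so actual figures like the 48 GB quoted above can be higher:

```python
def vram_estimate_gb(n_params, bytes_per_param=2, overhead=1.4):
    """Rough VRAM estimate for LoRA-style fine-tuning:
    fp16 weight memory times a flat overhead factor for
    adapters, optimizer state, and activations (assumed)."""
    return n_params * bytes_per_param * overhead / 1e9

for name, n in [("7B", 7e9), ("13B", 13e9)]:
    print(f"{name}: ~{vram_estimate_gb(n):.0f} GB")
```

By this crude estimate the 7B model lands around 20 GB (why a 24 GB 4090 works) and the 13B model is already well past 24 GB, which matches the advice above to step up to a 48 GB card.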
A maintainer of the project has been collecting a full list here (with different quantization levels), most of which are Llama 2-based: https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f...
Since the release of Llama 2, the number of models based on it has been growing significantly. Some popular ones:
- codeup (A code generation model - DeepSE)
- llama2-uncensored (George Sung)
- nous-hermes-llama2 (Nous Research)
- wizardlm-uncensored (WizardLM)
- stablebeluga (Stability AI)
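For anyone who wants to try one of these, the Ollama CLI pulls and runs a model by name. The exact tags below are assumptions based on the names in that list; check the registry or `ollama list` for what's actually available:

```shell
# Pull and chat with one of the community Llama 2 fine-tunes
ollama run nous-hermes-llama2

# Larger parameter counts are usually separate tags, e.g.:
ollama run llama2:13b
```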
The article also recommends oobabooga's text-generation-webui, which includes a full web dashboard.