>when it is well-known in the field that parameter-efficient fine-tuning always pays a cost in terms of performance relative to full fine-tuning
The LoRA paper states the method's performance plainly: "LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency.": https://arxiv.org/abs/2106.09685
I don't want to get into the weeds of the subtleties of evaluation, hyperparameter tuning, and model comparisons, but let's just say that subsequent studies have shown that LoRA, consistent with most parameter-efficient tuning methods, underperforms full fine-tuning: https://arxiv.org/abs/2203.06904
A simple way to think about it is this: if LoRA really gave full fine-tuning performance, why would anyone ever fully fine-tune a model?
To balance my view a little, it is definitely a valid question to ask "how far can we get with parameter-efficient tuning", and I firmly believe that as models get larger, the answer is "very, very far".
That said, I also dislike it when it is carelessly claimed that parameter-efficient tuning is as good as full fine-tuning, without qualifications or nuance.
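For anyone unfamiliar with the mechanics being debated here: LoRA freezes the pretrained weight W0 and trains only a low-rank update BA on top of it, which can be folded back into W0 before serving (that's where the "no additional inference latency" claim comes from). Below is a minimal PyTorch sketch of the idea; the class name `LoRALinear`, the init scale, and the default r/alpha values are illustrative choices of mine, not any particular library's API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative sketch of a LoRA-adapted linear layer:
    h = W0 x + (alpha / r) * B A x, with W0 frozen and only A, B trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained W0
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Per the paper, A starts random and B starts at zero, so the
        # adapted layer is initially identical to the frozen base model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

    def merge(self) -> nn.Linear:
        """Fold BA back into W0 so inference is a single matmul,
        i.e. no extra latency relative to the original layer."""
        merged = nn.Linear(self.base.in_features, self.base.out_features,
                           bias=self.base.bias is not None)
        merged.weight.data = self.base.weight.data + self.scaling * (self.B @ self.A)
        if self.base.bias is not None:
            merged.bias.data = self.base.bias.data.clone()
        return merged
```

Note that only A and B receive gradients, which is exactly why the trainable-parameter count (and optimizer state) is so much smaller than for full fine-tuning, and also why the hypothesis space is strictly more constrained.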
It is not apparent to me that full fine-tuning should be better, especially since the LoRA method seems like it could be robust against catastrophic forgetting.