jprafael on March 24, 2023 | on: LoRA: Low-Rank Adaptation of Large Language Models

Computing gradients is easy and cheap. What this technique solves is memory: you no longer need to keep gradients (and optimizer state) for the frozen pretrained weights around during training, only for the small low-rank matrices. That saves expensive GPU RAM and lets you use commodity hardware.
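
To put rough numbers on the memory point, here is a back-of-the-envelope sketch in Python. It assumes a single 4096 x 4096 weight matrix, rank r = 8, and 32-bit gradient values; those sizes and the function names are made up for illustration, and it counts only gradient storage (optimizer state and activations are left out).

```python
# Rough sketch: gradient storage for one weight matrix,
# full fine-tuning vs. a LoRA-style low-rank update.
# All sizes are illustrative assumptions, not figures from the thread.

def grad_bytes_full(d: int, k: int, bytes_per_value: int = 4) -> int:
    """Gradient storage when the whole d x k weight matrix is trainable."""
    return d * k * bytes_per_value

def grad_bytes_lora(d: int, k: int, r: int, bytes_per_value: int = 4) -> int:
    """Gradient storage when only the factors B (d x r) and A (r x k) are trainable."""
    return (d * r + r * k) * bytes_per_value

d, k, r = 4096, 4096, 8  # assumed layer shape and rank
full = grad_bytes_full(d, k)
lora = grad_bytes_lora(d, k, r)

print(f"full: {full / 2**20:.1f} MiB")
print(f"LoRA: {lora / 2**20:.2f} MiB")
print(f"ratio: {full // lora}x")
```

Under these assumptions the gradient buffer for that one matrix drops from 64 MiB to 0.25 MiB, and the same ratio applies to every layer adapted this way.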