You have to do that with LoRA regardless, to compute the gradients for the lowest-level LoRA weights.
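
To make the point concrete, here is a toy sketch. It assumes "that" means backpropagating through the full depth of the network, and it uses a made-up scalar model where each layer has a frozen weight `w` plus a trainable product `a * b` standing in for the LoRA update. The function names and numbers are hypothetical and purely illustrative; this is not how any real library implements it.

```python
# Toy scalar "network": each layer computes h_out = (w + a * b) * h_in,
# where w is a frozen base weight and a, b are trainable LoRA-style factors.
# All names and values here are illustrative assumptions.

def forward(x, layers):
    """Return the activation entering each layer, plus the final output."""
    inputs = []
    h = x
    for w, a, b in layers:
        inputs.append(h)
        h = (w + a * b) * h
    return inputs, h

def lora_grads(x, target, layers):
    """Manual backprop for loss = 0.5 * (output - target) ** 2.

    Returns one (dL/da, dL/db) pair per layer. No gradient is stored
    for the frozen w, but the error signal still has to flow through it.
    """
    inputs, out = forward(x, layers)
    upstream = out - target  # dLoss/dOutput
    grads = [None] * len(layers)
    # Walk from the top layer down to the lowest one.
    for i in reversed(range(len(layers))):
        w, a, b = layers[i]
        h_in = inputs[i]
        grads[i] = (upstream * b * h_in, upstream * a * h_in)
        # Propagate through this layer's effective weight (frozen + adapter).
        upstream = upstream * (w + a * b)
    return grads

# Three layers, each given as (frozen w, LoRA a, LoRA b).
layers = [(2.0, 0.5, 1.0), (3.0, 0.0, 1.0), (-1.0, 1.0, 0.5)]
grads = lora_grads(1.0, 0.0, layers)
print(grads[0])  # gradients for the lowest layer's LoRA factors
```

The gradient for the lowest layer's `a` and `b` only becomes available at the last iteration of the backward loop, after `upstream` has been multiplied through every layer above it. Freezing `w` saves storing and updating its gradient, but it does not shorten the backward pass.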