I'm sorry, I don't understand what you mean. I checked the original article again too. As it stands, my understanding is you are claiming:

- blowing on a GPU (which I take to mean doing roughly nothing)

- gets roughly the same perf change

- as moving from fp16 to q4

Are you referring to the finetuning part?

The bug fixes are separate from the finetuning sections. Unsloth itself makes finetuning 2x faster and uses 70% less memory, but the bug fixes are entirely detached from finetuning: you can take the fixed version we uploaded at https://huggingface.co/unsloth/phi-4 and use it in any framework or inference engine.
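For instance, a minimal sketch of loading the fixed checkpoint with plain transformers (only the model name comes from this thread; everything else is standard transformers usage):

    # Minimal sketch, assuming standard transformers usage; only the
    # model name "unsloth/phi-4" comes from this thread.
    # device_map="auto" requires the accelerate package.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "unsloth/phi-4"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )

    inputs = tokenizer("Hello from the fixed Phi-4!", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(out[0], skip_special_tokens=True))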

Apologies, I'm a bit confused by the comment.

If you're questioning the credibility of the bug fixes: we fixed 8 bugs in Gemma (https://x.com/danielhanchen/status/1765446273661075609), multiple bugs in Llama, Mistral, and Qwen, a gradient accumulation bug (https://x.com/danielhanchen/status/1846235913443262891, sketched below), and much more.
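For context, here's a rough sketch of the gradient accumulation pitfall (my reconstruction of the general issue, not Unsloth's actual patch): averaging the cross-entropy loss per micro-batch and then dividing by the number of accumulation steps weights short sequences too heavily; normalizing by the total token count across the whole accumulation window matches full-batch training.

    # Hedged reconstruction of the gradient accumulation fix, not the
    # actual Unsloth patch. Labels use -100 for padding, the usual
    # Hugging Face convention.
    import torch.nn.functional as F

    def accumulate_step(model, optimizer, micro_batches):
        # Normalize by tokens across the WHOLE window, not per micro-batch.
        total_tokens = sum((labels != -100).sum() for _, labels in micro_batches)
        optimizer.zero_grad()
        for inputs, labels in micro_batches:
            logits = model(inputs)
            # Sum per micro-batch, divide by the window's token count.
            # The buggy form instead takes each micro-batch's mean and
            # divides by len(micro_batches).
            loss = F.cross_entropy(
                logits.view(-1, logits.size(-1)),
                labels.view(-1),
                ignore_index=-100,
                reduction="sum",
            ) / total_tokens
            loss.backward()
        optimizer.step()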


2x faster than what?

Oh, 2x faster and >70% less memory than Hugging Face + Flash Attention 2! I gave a CUDA / GPU Mode talk about it here: https://www.youtube.com/watch?v=hfb_AIhDYnA, one to the PyTorch team here: https://www.youtube.com/watch?v=MQwryfkydc0, and one at the PyTorch Conference here: https://www.youtube.com/watch?v=PdtKkc5jB4g
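For reference, a hedged sketch of typical Unsloth usage; the parameter values here are illustrative choices on my part, not numbers from this thread:

    # Sketch of typical Unsloth usage: load a model in 4-bit and attach
    # LoRA adapters; the rank and target modules below are illustrative.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/phi-4",  # the fixed upload linked earlier
        max_seq_length=2048,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,            # LoRA rank, illustrative
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
    )
    # The model then trains with the usual Hugging Face tooling,
    # e.g. trl's SFTTrainer.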

Update - the Phi-4 team is working on adding all our fixes to the original model! https://huggingface.co/microsoft/phi-4/discussions/21


