Open source Kaldi gives you 7.8%, so Microsoft didn't get that much further.
Also, a major issue with this kind of research is that they combined several systems to get the best results. Most practical systems don't use combinations because they are too slow.
So this model won't be free software? Odd, and a bummer...
Also, I'm not sure a relative error reduction of roughly 20% (1 - 6.3/7.8 ≈ 0.19) should be considered small; it really depends on the particular challenge. For example, sentiment analysis only starts to get interesting above 80% on some datasets, since much can be guessed correctly in very naive ways.
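To make the arithmetic in that parenthesis explicit, here is a minimal sketch. The function name `relative_error_reduction` is made up for illustration; the only inputs are the two figures quoted in this thread (7.8% for Kaldi, 6.3% for the Microsoft system).

```python
def relative_error_reduction(baseline: float, improved: float) -> float:
    """Return the fraction of the baseline error removed by the improved system.

    Hypothetical helper for illustration; both arguments are error rates
    in the same unit (here: percent).
    """
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return 1.0 - improved / baseline


# Error rates quoted in the thread, in percent.
kaldi_error = 7.8
microsoft_error = 6.3

reduction = relative_error_reduction(kaldi_error, microsoft_error)
print(f"{reduction:.1%}")  # prints "19.2%"
```

So the absolute gap is 1.5 points, but relative to the baseline it is about a fifth of the errors, which is why "small" depends on how you count.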
Human level on this task is estimated to be ~4%, so we still have quite a lot of ground to cover.