Open source Kaldi gives you 7.8%; Microsoft didn't go too far. Also, major iss...

WilliamDhalgren · on Sept 15, 2016

So this model won't be free software??? Odd, and bummer...

Also, I'm not sure an error reduction of about 20% (1 - 6.3/7.8) should be considered small; it really depends on the particular challenge. For example, sentiment analysis only starts to get interesting above 80% accuracy on some datasets, since that much can be guessed correctly in very naive ways.

Human-level WER on this task is estimated to be ~4%, so we still have quite a lot of ground to cover.

timgws · on Sept 15, 2016

There are 33% fewer errors with the Microsoft solution than with Kaldi... one could say that is quite significant.

_r5wf · on Sept 19, 2016

A relative decrease in WER is not so significant at lower percentages. How about: "we make 6 errors per 100 words, but Kaldi makes 7"?
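For anyone wanting to check the numbers being thrown around, here is a minimal sketch of the arithmetic, using only the two WER figures quoted in this thread (7.8% for Kaldi, 6.3% for Microsoft). The function and variable names are made up for illustration. It shows how the same gap looks as an absolute difference, as a reduction relative to Kaldi, and as an increase relative to Microsoft, which is one possible reading of why different percentages get quoted.

```python
def relative_change(reference: float, other: float) -> float:
    """Return |reference - other| as a fraction of `reference`."""
    return abs(reference - other) / reference


# WER figures quoted in this thread, in percent (not re-derived here).
kaldi_wer = 7.8
microsoft_wer = 6.3

# Absolute gap: errors per 100 words.
absolute_gap = kaldi_wer - microsoft_wer

# Relative to Kaldi: how many of Kaldi's errors go away (1 - 6.3/7.8).
reduction_vs_kaldi = relative_change(kaldi_wer, microsoft_wer)

# Relative to Microsoft: how many more errors Kaldi makes.
increase_vs_microsoft = relative_change(microsoft_wer, kaldi_wer)

print(f"absolute gap:            {absolute_gap:.1f} points")
print(f"reduction vs. Kaldi:     {reduction_vs_kaldi:.1%}")
print(f"increase vs. Microsoft:  {increase_vs_microsoft:.1%}")
```

The absolute gap is about 1.5 errors per 100 words; the relative figure comes out a bit under 20% or a bit under 24% depending on which system is taken as the baseline.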

dave168 · on Sept 16, 2016

It is cool that anyone can use CNTK to produce something similar now.

nshm · on Sept 15, 2016

23% only

visarga · on Sept 15, 2016

Maybe they can distill the ensemble into a compact and efficient version for production.

danielmorozoff · on Sept 14, 2016

Could you please provide a link?

nshm · on Sept 14, 2016

https://github.com/kaldi-asr/kaldi/blob/master/egs/fisher_sw...

In the Microsoft paper http://arxiv.org/pdf/1609.03528v1.pdf (Table 5), it's the line "Povey et al. [19] LSTM". http://www.isca-speech.org/archive/Interspeech_2016/pdfs/059... Interpolate it to the RNNLM column and you'll get 7.8. See also http://www.isca-speech.org/archive/Interspeech_2016/pdfs/047...