My Mac mini is heavily constrained: it has only 2 proper cores and not much RAM, so the smallest models run best. The quantized tiny model runs better than the regular tiny simply because of memory pressure.
In my testing on the Mac mini, processing tends to take about 30% longer than the duration of the recorded audio clip. I added a warning that tells the user to reduce their thread count if processing takes longer than a set threshold.
Some of my stream viewers report that it is pretty fast for them, though (much faster than what they see on my Mac mini).
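
For anyone curious what that kind of warning check could look like, here is a rough Python sketch. It is not the actual code: the function names, the message text, and the threshold of 1.0 (processing slower than real time) are all made up for illustration.

```python
from typing import Optional

# Hypothetical threshold: warn once processing takes longer than the clip itself.
# The real threshold used by the warning is not stated above.
SLOW_FACTOR_THRESHOLD = 1.0


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Ratio of processing time to clip length (1.3 means 30% longer than the clip)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def slow_processing_warning(
    processing_seconds: float,
    audio_seconds: float,
    threads: int,
    threshold: float = SLOW_FACTOR_THRESHOLD,
) -> Optional[str]:
    """Return a warning string if processing was slower than the threshold, else None."""
    factor = real_time_factor(processing_seconds, audio_seconds)
    if factor <= threshold:
        return None
    return (
        f"Processing took {factor:.2f}x the clip length with {threads} threads; "
        "try reducing the thread count."
    )


if __name__ == "__main__":
    # A 10 s clip that took 13 s to process, roughly the Mac mini case above.
    print(slow_processing_warning(13.0, 10.0, threads=4))
```

With the numbers from my Mac mini (about 30% over the clip length), the factor comes out around 1.3, so a threshold of 1.0 would trigger the warning there and stay quiet on faster machines.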