The big standout to me, compared with almost any other text-to-video solution, is that the video duration is tremendously longer (a minute or more). Everything else that I've seen can't get beyond 15 to 20 seconds at the absolute maximum.
In terms of following the prompt and generating visually interesting results, I think they're comparable, but the resolution for Sora seems far ahead.
Worth noting that Google also has Phenaki [0], VideoPoet [1], and Imagen Video [2].