Given the history of cinema, I predict that this will be used excessively by some directors, who will fall into an "uncanny valley" that the perceptive and intelligent will find disturbing, while others will find it to be a "super-stimulus."
Then the industry will mature, and this will be used with greater subtlety.
Is the intention to use it in live-action cinema? I assumed it would be used more in animation (especially mo-cap), which will still fall into the uncanny valley traps, but it should make for smoother transitions between emotions in those kinds of output.
This reminds me of the automatic pitch correction (auto-tuning) in recent pop music. Hopefully this will not signify the dusk of the art of acting.
I don't think so. The actor is supplying the 'raw materials'; this is another tool for the director to achieve the truest representation of their vision. Actors might appreciate not having to do an entire retake due to one duff line, and they still get paid :)
And the next step will be to show different variations to multiple focus groups, then train a neural network to automatically select the set of blending weights for the film that is projected to maximize revenue, thereby automating post-production as well.
Maybe, but that sounds pretty dire. How about interactive gaming, where your input affects the mood of the conversation without the start/stop clip style used now?
The faces might be blending, but the audio isn't exactly something I enjoy; pretty wild that that's how it sounds. You could surely pick a favorite audio track or, à la Max Martin, comp all the takes for material, knowing the digital warp will be passable. Slick technique. I'm definitely impressed with the show of tech and capability.
This was more focused on the video blending (and how it's synchronized with the audio cues). The clipping is from a naïve audio speed algorithm (the other naïve approach results in pitch shifting).
One thing this probably does give, though, is the necessary curve to output a well-formed audio blend.
Most likely the directors / editors who use this would overdub the audio once they get the visual take they want. They already overdub audio pretty often anyway, due to set noise.
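To make the two naïve speed-change approaches concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions, not the algorithm used in the demo: I'm guessing that the artifacts come from splicing chunks together without a crossfade, and every name here (`speed_up_by_resampling`, `speed_up_by_dropping_chunks`, the 8 kHz rate, the 450-sample chunk) is made up for the example. Resampling shortens the clip but raises the pitch; dropping chunks keeps the pitch but leaves hard jumps at the cut points.

```python
import math

RATE = 8000  # samples per second (assumed, for illustration only)


def sine(freq_hz, seconds, rate=RATE):
    """Generate a pure test tone as a list of floats in [-1, 1]."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def speed_up_by_resampling(samples, factor):
    """Naive approach A: keep every `factor`-th sample.

    The clip gets shorter, but played back at the same rate the pitch
    rises by the same factor.
    """
    return samples[::factor]


def speed_up_by_dropping_chunks(samples, factor, chunk=450):
    """Naive approach B: keep one chunk out of every `factor`, no crossfade.

    Pitch is preserved inside each chunk, but the waveform jumps at every
    splice, which is heard as clicks.
    """
    out = []
    for start in range(0, len(samples), chunk * factor):
        out.extend(samples[start:start + chunk])
    return out


def estimate_pitch_hz(samples, rate=RATE):
    """Crude pitch estimate from the zero-crossing count (two per cycle)."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (2 * len(samples) / rate)


def max_jump(samples):
    """Largest sample-to-sample step; large values indicate discontinuities."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


tone = sine(220, 1.0)
resampled = speed_up_by_resampling(tone, 2)
spliced = speed_up_by_dropping_chunks(tone, 2)

print("pitch (Hz):", estimate_pitch_hz(tone), estimate_pitch_hz(resampled), estimate_pitch_hz(spliced))
print("max jump:  ", max_jump(tone), max_jump(resampled), max_jump(spliced))
```

Proper time-stretching methods (overlap-add variants, phase vocoders) exist precisely to avoid both problems, which is presumably what a production version or an overdub would replace this with.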
The mad + sad takes seemed to give a lot more emotional nuance than either take on its own. I like this, but I hope it doesn't become as overused as autotune is.
This is really impressive. I can't see any seams in the merged output, though I do see the actors' eyes creepily going out of alignment (a fixable problem).