This video is AMAZING and well worth watching. Skip ahead a minute or so if you have a very short attention span.
I most enjoyed watching Obama with Bush's facial expressions -- Bush is so expressive that it just looks incredibly wrong on Obama's face -- I wish they would run an Obama speech on Bush's face as well.
Honestly I was most impressed by that pairing. The software seemed to "get" Obama's "persona": while Bush purses his bottom lip over his top lip, Obama had more of a pucker where the lips are merely pressed together. [0]
It seems like Obama's lip is given a different stiffness than Bush's. I wonder whether that comes from the software -- e.g. his lip is modeled as stiffer for personality or physical reasons -- or from something else.
I'm thinking this technology demo will be spliced into 5s edits for at least one prime-time TV show themed around shadowy intelligence agencies by the middle of next year.
Show writers will be able to incorporate this as a plot enabler for upcoming episodes, assuming they haven't already finished shooting the current seasons.
I don't know if you noticed, but they also ran the audio through some modulation to make Bush's voice sound like Obama's as well. It was surprisingly convincing...
"...they all came to the realization that what made this place a success was not the collision-avoidance algorithms or the bouncer daemons or any of that other stuff. It was Juanita's faces." Neal Stephenson, Snow Crash.
I think in our world as well, VR will really take off when technology like this is applied.
I wonder if the rendering complexity is low enough to work in games? (~real time) Even with the prior knowledge of well-known celebrity expressions, these faces do seem much more expressive than those of current video game characters who have been modeled with motion capture.
Does it need to work in real time? The only person present while a game is running is usually the gamer. L.A. Noire did a great job with ahead-of-time facial modeling based on real people.
The eyes look dull because they're using very simple rendering techniques. I'd guess that they could make the eyes much more realistic by adding some simple optical effects like specular highlights or Fresnel reflection (http://www.graphics.cornell.edu/~westin/misc/fresnel.jpg).
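For anyone curious what that effect actually computes: the standard cheap trick is Schlick's approximation of Fresnel reflectance, which makes a surface reflect little light head-on and nearly everything at grazing angles -- that angle-dependent rim brightness is a lot of what makes eyes look "wet". A minimal sketch (the cornea refractive index of 1.376 is a commonly cited value, my assumption, nothing from the paper):

```python
def schlick_fresnel(cos_theta: float, n1: float = 1.0, n2: float = 1.376) -> float:
    """Schlick's approximation of Fresnel reflectance.

    cos_theta: cosine of the angle between the view direction and the
    surface normal (1.0 = looking straight on, 0.0 = grazing).
    n1, n2: refractive indices of the two media (air -> cornea here).
    """
    # Reflectance at normal incidence
    r0 = ((n1 - n2) / (n1 + n2)) ** 2
    # Interpolate toward total reflection at grazing angles
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5
```

At normal incidence this gives roughly 2.5% reflectance, rising to 100% at grazing angles, so a shader using it only brightens the eye's silhouette and highlight regions rather than washing out the whole surface.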
One interesting thing that I noticed at 1:45 is that visible teeth can really change a person's appearance -- Bush doesn't seem to use his teeth at all when he talks.
In some cases, movement of the eyelids when blinking is tracked and reflected in the model, but the texture is not changed to a closed eye quickly enough.
Does anyone think this may be a new path to immortality? Instead of just photos, soon we will have models that take video input of our family members, superimpose them over AI+visual models (like this one), and we will be able to have conversations with them, even though they aren't real.
Also, I think you're thinking too narrowly... I think if this were perfected, and the physical technology put in place, most people would opt to look like Tom Hanks (or someone else as famous and attractive) rather than like themselves.
It makes you wonder whether one day there will be a law against looking like Tom Hanks, or any celebrity. It would make crimes difficult to solve when every suspect is identified as Tom Hanks!
I don't get the W vs Obama split screen. The voice was changed slightly to sound like neither individual... The Obama 3D face fails horribly when it tries to snicker a little bit... This is only slightly better than Fallout 4 quality facial animation.
I think the point is the assembling of random photos/videos into a convincing representation of textures and animation (without devoted hardware, bringing the person in for a scan, etc), not that 'this is bleeding edge visuals'.
I think part of the effect is that they create a texture from photos - today's top games don't do this, they scan and model the face, so it looks higher-fidelity but less realistic (while this is very low fidelity). Mafia 1 used a similar technique to achieve really nice results, given how low-poly the models were back then.
Amazing results and fantastic work! I noticed that the examples are all frontal faces at no more than a 20 to 30 degree angle, with no profiles. I haven't read the paper, but does this mean it only reconstructs a 2D texture and not the 3D structure?
I'm not a lawyer but I would guess that there is no reason they need to change it. As long as they're not profiting from the likenesses of the celebrities they're emulating, then the paper is likely to be considered free speech that does not violate their privacy or publicity rights. See the legality of fair-use of someone's likeness. Some states might have different laws.
Real question: is it really "amazing" or "hyper-realistic"?
I have some kind of prosopagnosia -- I recognise people in video, but I absolutely don't recognise modelled faces; to me they just look like low-poly models on a Nintendo 64.
The faces in the FIFA video games are already rendered with much more detail than this. This paper is not about detail but about "reconstruct[ing] a controllable model of a person from a large photo collection that captures his or her persona". It is a very specific technique to get to a realistic interactive facial model, not one to get to a very detailed realistic interactive facial model.
Looks like there's some interesting research on Voice Conversion out there, but I'm not knowledgeable enough to comment on how close we are to seeing it as an actual product. This pdf (from 10 years ago) seems interesting, though. Hmmm: http://nlp.lsi.upc.edu/papers/thesis_helenca.pdf
There's a Scottish company that does something like this but it's boutique and requires a large corpus. They recreated Roger Ebert's voice after he lost his jaw but it was only possible because of his vast recorded collection of movie reviews to draw on for phonemes.