Engineers and researchers from Samsung’s AI Center in Moscow and Skolkovo Institute of Science and Technology have created a model that can generate realistic animated talking heads from images without relying on traditional methods, like 3D modeling.
“Effectively, the learned model serves as a realistic avatar of a person,” said engineer Egor Zakharov in a video explaining the results.
Well-known faces seen in the paper include Marilyn Monroe, Albert Einstein, Leonardo da Vinci’s Mona Lisa, and RZA from the Wu Tang Clan. The technology that focuses on synthesizing photorealistic head images and facial landmarks could be applied to video games, video conferences, or digital avatars like the kind now available on Samsung’s Galaxy S10. Facebook is also working on realistic avatars for its virtual reality initiatives.
Such tech could clearly also be used to create deepfakes.
Few-shot learning means the model can begin to animate a face using just a few images of an individual, or even a single image. Meta training with the VoxCeleb2 data set of videos is carried out before the model can animate previously unseen faces.
During the training process, the system creates three neural networks: The embedded network maps frames to vectors, a generator network maps facial landmarks in the synthesized video, and a discriminator network assesses the realism and pose of the generated images.