What Is Deepfake?
Deepfake is an AI-based technology used to produce or modify video content so that it presents something that has not happened.
The term comes from a Reddit user called “deepfakes” who, in December 2017, used deep learning techniques to swap the faces of celebrities onto performers in pornographic video clips. The word, which refers both to the technology and to the videos created with it, is a portmanteau of “deep learning” and “fake.”
A deepfake video is created using two competing AI systems: one called the generator and the other the discriminator. In essence, the generator creates a fake video clip and the discriminator then tries to determine whether the clip is real or fake.
Whenever the discriminator correctly identifies a clip as fake, it gives the generator a signal about what not to do when creating the next one.
Together, the generator and the discriminator form what is called a Generative Adversarial Network (GAN). The first step in building a GAN is to identify the desired output and assemble a training dataset for the generator.
Once the generator starts producing output of acceptable quality, its clips can be fed to the discriminator.
As the generator gets better at creating fake video clips, the discriminator gets better at spotting them; and as the discriminator gets better at detecting fakes, the generator gets better at producing them.
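To make that feedback loop concrete, here is a minimal GAN training sketch in Python with PyTorch. Everything in it is an illustrative assumption rather than a real deepfake pipeline: the networks are tiny, the “real” data is a random stand-in, and the models operate on flat vectors instead of video frames.

```python
import torch
import torch.nn as nn

LATENT_DIM, DATA_DIM = 64, 256  # illustrative sizes only

# Generator: maps random noise to a fake sample.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 128), nn.ReLU(),
    nn.Linear(128, DATA_DIM), nn.Tanh(),
)

# Discriminator: outputs a raw real-vs-fake score for a sample.
discriminator = nn.Sequential(
    nn.Linear(DATA_DIM, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
)

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

# Random stand-in for the training dataset described above.
real_batch = torch.rand(32, DATA_DIM) * 2 - 1

for step in range(1000):
    noise = torch.randn(32, LATENT_DIM)

    # 1. Train the discriminator to separate real samples from fakes.
    fake_batch = generator(noise).detach()  # detach: don't update G here
    d_loss = (loss_fn(discriminator(real_batch), torch.ones(32, 1))
              + loss_fn(discriminator(fake_batch), torch.zeros(32, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2. Train the generator to fool the discriminator; the discriminator's
    #    verdict is the "hint" that steers the next batch of fakes.
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(32, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

The adversarial pressure comes from the two loss terms pulling in opposite directions: the discriminator is rewarded for telling the two classes apart, while the generator is rewarded whenever its output is scored as real.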
AI Deepfakes Are Now As Simple As Typing What You Want Your Subject To Say
In the latest example of deepfake technology, researchers have introduced new software that uses machine learning to let users edit the text transcript of a video and thereby add, delete, or modify the words coming out of a speaker’s mouth.
The work was done by scientists from Stanford University, the Max Planck Institute for Informatics, Princeton University, and Adobe Research. It shows that our ability to edit what people say in videos, and to create realistic fakes, is becoming easier every day.
You can see below a number of examples of the system’s output, including an edited version of a famous quote from Apocalypse Now, with the line “I love the smell of napalm in the morning” changed to “I love the smell of french toast in the morning.”
This work is only at the research stage right now and is not available as consumer software, but it probably won’t be long before similar services go public. Adobe, for example, has already shared details of a software prototype called VoCo, which lets users edit recordings of speech as easily as a picture, and which was used in this research.
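To make the transcript-editing idea concrete, here is a small Python sketch (using the standard-library difflib module) of the first step such an interface needs: comparing the original and edited transcripts to find which words, and therefore which stretches of speech, must be re-synthesized. The word lists reuse the Apocalypse Now example above; nothing here is taken from the paper’s actual code.

```python
import difflib

# Original and user-edited transcripts, as word lists.
original = "I love the smell of napalm in the morning".split()
edited = "I love the smell of french toast in the morning".split()

# Find the regions where the transcripts differ; only these words
# need new visuals (and audio) synthesized.
matcher = difflib.SequenceMatcher(a=original, b=edited)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op != "equal":
        print(f"{op}: {original[a0:a1]} -> {edited[b0:b1]}")

# Output: replace: ['napalm'] -> ['french', 'toast']
```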
To create the video fakes, the scientists combine a number of techniques. First, they scan the target video to isolate the phonemes spoken by the subject. (These are the constituent sounds that make up words, such as “oo” and “fuh.”) They then match these phonemes to the corresponding visemes, the facial expressions that accompany each sound.
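At its simplest, that phoneme-to-viseme step is a lookup: many phonemes share a single mouth shape. The toy mapping below is a Python illustration of the idea; the symbols and groupings are invented for the example and are not the mapping used in the paper.

```python
# Toy phoneme-to-viseme table: several phonemes map to one mouth shape.
PHONEME_TO_VISEME = {
    "p": "lips_pressed", "b": "lips_pressed", "m": "lips_pressed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "uw": "lips_rounded",  # the "oo" sound mentioned above
    "aa": "jaw_open",
}

def visemes_for(phonemes):
    # Unknown phonemes fall back to a neutral mouth shape.
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

print(visemes_for(["f", "uw", "d"]))
# ['lip_to_teeth', 'lips_rounded', 'neutral']
```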
In tests in which the fake videos were shown to a group of 138 volunteers, about 60% of the participants thought the edited clips were real.
That may sound like a poor result, but only 80% of the same group thought the unedited original footage was genuine. (The researchers note this may be because participants had been told their answers were being used for a study on video editing, meaning they were primed to look for fakes.)
As always, it’s important to remember that there are limits to what this technology can do.
The algorithms here only work on talking-head-style videos, for example, and require 40 minutes of input data. The edited speech also can’t differ too much from the source material.
The researchers also asked the subjects being edited to record new audio to match the changes, with the AI then generating the matching video. (This is because audio fakes are sometimes mediocre, though the quality is improving dramatically.)
The researchers also noted that they cannot yet change the mood or tone of the speaker’s voice, as doing so leads to “strange results,” and that any occlusion of the face (for example, if someone waves a hand in front of their mouth while speaking) completely throws off the algorithm.
The technology is therefore not perfect, but these are the sorts of limitations typical of early-stage research, and it is almost guaranteed that they will be overcome in time. That means society as a whole will soon have to grapple with the underlying concept of this research: the arrival of software that lets anyone edit what people say in videos, with no technical training required.
The potential harms of this technology are extremely worrying, and researchers in this field are often criticized for failing to consider the potential misuse of their work. The scientists involved in this particular project say they have taken these issues into account.
In a blog post accompanying the paper, they write: “Although image and video manipulation methods are as old as the media themselves, the risks of abuse are heightened when they are applied to a mode of communication that is sometimes considered authoritative evidence of thoughts and intentions. We acknowledge that bad actors might use such technologies to falsify personal statements and slander prominent individuals.”
But the remedy they suggest is hardly comforting. In their view, to avoid confusion, videos edited by AI should be clearly presented as such, either through a watermark or through context (for example, an audience that understands it is watching a fictional film).
But watermarks are easily removed, and loss of context is one of the defining features of online media. Nor do fakes have to be flawless to have an impact. A host of fake news articles can be debunked with a few minutes of research, but that does not stop their spread, especially in communities that want to believe lies that match their preconceptions.
The researchers note that this technology also has many beneficial uses. It could greatly help the film and television industries, allowing them to fix mis-spoken lines without re-shooting footage and to create seamless dubs of actors speaking different languages.
But those benefits seem underwhelming compared with the potential damage. Even if one argues that deepfake propaganda is not as big a threat as many believe, research advances like this remain deeply troubling.
New Deepfake Technology Turns A Single Photo And Audio File Into A Singing Video Portrait
Another day, another deepfake, but this time it can sing.
A new study from Imperial College London and the Samsung AI Research Center in the UK shows how a single photo and audio file can be used to generate a singing or speaking video portrait.
Like previous deepfake programs, the researchers use machine learning to generate their results. And while the fakes are far from 100% realistic, the results are remarkable given how little data they require.
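Conceptually, the setup pairs one identity signal (taken from the single photo) with a per-frame audio signal to produce one video frame per slice of audio. The PyTorch sketch below is only a schematic of that idea, with made-up dimensions and layers; the study’s actual architecture is not reproduced here.

```python
import torch
import torch.nn as nn

AUDIO_DIM, ID_DIM, FRAME_PIXELS = 80, 128, 64 * 64  # illustrative sizes

# Encode the reference photo into an identity embedding.
identity_encoder = nn.Sequential(nn.Linear(FRAME_PIXELS, ID_DIM), nn.ReLU())

# Decode (identity + audio features) into one frame of pixels.
frame_decoder = nn.Sequential(
    nn.Linear(ID_DIM + AUDIO_DIM, 256), nn.ReLU(),
    nn.Linear(256, FRAME_PIXELS), nn.Tanh(),
)

photo = torch.rand(1, FRAME_PIXELS)         # the single reference portrait
audio_frames = torch.randn(100, AUDIO_DIM)  # e.g. 100 windows of spectrogram features

# Reuse the same identity for every audio window, one frame per window.
identity = identity_encoder(photo).expand(len(audio_frames), -1)
video = frame_decoder(torch.cat([identity, audio_frames], dim=1))
print(video.shape)  # torch.Size([100, 4096]): 100 frames of 64x64 pixels
```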
As mentioned above, the output is not entirely realistic, but it is the latest illustration of how quickly this technology is evolving.
Deepfake generation techniques are becoming easier and easier, and although research like this is not commercially available, earlier breakthroughs were quickly packaged into easy-to-use software by the original deepfake creators. The same will surely happen with these new approaches.
Research like this naturally raises worries about how it will be used for misinformation and propaganda, an issue that is currently vexing US lawmakers.
And while you may argue that such fears are exaggerated in the political arena, deepfakes have already caused real harm, especially to women, who have been targeted with embarrassing and degrading non-consensual pornography.
Getting Rasputin to sing Beyoncé is light relief for the moment, but we do not know what strange and terrible uses might emerge in the future.