That clip is terrifying. And not just because gay panic is still being used for laughs. Imagine if you could type whatever you wanted the President of the United States to say and he would say it. You could wield the power of Steve Bannon without looking like a sad muppet who got locked in a wind tunnel.
Adobe claims that it has robust methods for watermarking audio generated by Voco. The idea is to put an inaudible digital signature on any file manipulated by Voco so that people can tell doctored audio from the genuine article. Then again, as I write this, the current version of Adobe Photoshop is the most pirated program on The Pirate Bay. So claims that they can secure their software should be taken with a grain of salt.
People might get around their watermarks by copying their doctored audio using lossy methods, like copying to and from a cassette tape. Or Voco might just be hacked and reverse-engineered faster than you can falsify Adobe CEO Shantanu Narayen saying, "I told you so." And thanks to Voco's simple interface, that will be pretty damn fast.
We're about to enter an age when hearing someone's voice say something does very little to prove they actually said it. So what can we rely on instead? Do we need a video of someone clearly and distinctly saying something directly into the camera? Funny I should mention that...
Well, not "ha ha" funny but rather "oh Jesus, we're all fucked" funny. Face2Face is a new tech that allows you to take existing footage of a person and puppeteer their image with your own movements. You can take basically anyone who's been on CNN and create a new video where they ape whatever you're doing. That is to say, you could wield the power of anyone who actually breaks news.
So, in the video linked above, you'll see people in a lab puppeteering Putin and Trump pretty darn convincingly. This is still a relatively new technique, so there are a few weird artifacts, but we aren't far off from video of a person that is basically created out of whole cloth. I don't think I need to spell out what could be done with a combination of Adobe Voco and Face2Face: Someone could have made a good Grand Moff Tarkin in Rogue One.
But that also means this technology will soon be used for less virtuous goals. Someone is going to use these technologies to tank a political candidate. Someone is going to use them to try to incite war. Someone might even use them to fake a video where Bill Murray shows up to somebody's birthday party. We just don't know. If we can't trust audio and we can't trust video, how do we continue to have news?
Before we burn Atlanta to the ground screaming, "Technology has killed truth!" keep in mind that as long as there has been evidence, people have been faking evidence. Doctoring photos was a convincing art long before Photoshop came around. There's even a famous doctored photo of Lincoln, where his head was placed on a portrait of Southern leader John Calhoun ...
Photo: Library of Congress