I am reading "the end of reality" in The Atlantic. It is about how AI is being used to create fake videos, and how that might disrupt the political system and society's view of "reality". I haven't finished the article yet, but the following passage caught my eye:
One of deepfakes’s compatriots told Vice’s Motherboard site in January that he intends to democratize this work. He wants to refine the process, further automating it, which would allow anyone to transpose the disembodied head of a crush or an ex or a co-worker into an extant pornographic clip with just a few simple steps. No technical knowledge would be required. And because academic and commercial labs are developing even more-sophisticated tools for non-pornographic purposes—algorithms that map facial expressions and mimic voices with precision—the sordid fakes will soon acquire even greater verisimilitude.
So, here are some thoughts:
1/ Academia and the companies creating these "even more-sophisticated" algorithms could define an "open file format" and a "dedicated file extension" for computer-generated videos.
There are many extensions for videos already: .mp4, .mov, .avi, .ogv, etc. What's one more format to add? If the algorithms that generate videos encoded their output in a specific format, it would be easy to tell whether a video is real or generated. The format acts like a signature. An analogy is "HTTP" vs "HTTPS", which flags internet traffic as non-secure or secure. It would be useful to create a similar flag for "real" vs "generated" videos.
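To make the signature idea concrete, here is a minimal sketch of how a player or checker could read a container's leading "magic bytes", the way real formats like Ogg already announce themselves. The `b"GENV"` signature for a hypothetical generated-video format is my own invention for illustration; only the Ogg magic is real.

```python
# Sketch: telling files apart by their container "magic number".
MAGIC_GENERATED = b"GENV"   # hypothetical signature for a generated-video format
MAGIC_OGG = b"OggS"         # real magic bytes of the Ogg container (.ogv)

def classify(path):
    """Return a label based on the file's leading magic bytes."""
    with open(path, "rb") as f:
        header = f.read(4)
    if header == MAGIC_GENERATED:
        return "generated"
    if header == MAGIC_OGG:
        return "ogg (conventional)"
    return "unknown"
```

A player that sees the "generated" label could then refuse to present the file as a regular video, or hand the label to its UI layer.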
2/ Online video players (YouTube, Vimeo), device-installed ones (VLC, iTunes), and even HTML video elements could automatically display a tag marking the video as "VR". This tag could appear next to the usual video controls, like the pause, settings, and share buttons. It could even appear and then fade away, so it doesn't interrupt the viewer's experience.
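The player-side logic could be as small as a function that maps the file's real-vs-generated flag to a badge. This is a sketch under my own assumptions: the "VR" label text and the three-second fade are illustrative choices, not any real player's behavior.

```python
# Sketch of player-side badge logic: which tag to overlay, and for how
# long before it fades out of the viewer's way.
def badge_for(container_flag, fade_seconds=3.0):
    """Return (label, seconds_visible) for the UI overlay, or None.

    `container_flag` is the real-vs-generated flag read from the file
    format; anything other than "generated" shows no badge at all.
    """
    if container_flag == "generated":
        # Show the tag next to the pause/settings/share controls,
        # then fade it out so it doesn't interrupt the viewer.
        return ("VR", fade_seconds)
    return None
```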
Alternatively, or in addition to creating a file extension and format, academia and companies could agree on embedding a "diffuse" signal inside generated files. This signal would be easy to pick up but hard to remove; if it were removed, the file would be corrupted and the video wouldn't play. Easier said than done, I know! But if we are smart enough to create AI sophisticated enough to generate advanced virtual reality, we can figure out how to do this. Research on this already exists in the field of information security; ensuring files can't be modified undetected is hardly a new problem.
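For the "removing it corrupts the file" property, information security already offers tamper-evidence primitives. Below is a minimal sketch using an HMAC over the video payload; the key and the append-a-tag framing are assumptions for illustration. A real design would use public-key signatures instead, since a published HMAC key would let anyone forge tags, but the verification idea is the same: if the signal is stripped or the bytes are altered, the check fails and a compliant player refuses to play the file.

```python
import hashlib
import hmac

# Hypothetical shared verification key; illustration only.
VENDOR_KEY = b"published-verification-key"

def sign_video(payload: bytes) -> bytes:
    """Append a 32-byte tamper-evidence tag to the video payload."""
    tag = hmac.new(VENDOR_KEY, payload, hashlib.sha256).digest()
    return payload + tag

def verify_video(blob: bytes) -> bool:
    """A compliant player would refuse to play when this returns False."""
    payload, tag = blob[:-32], blob[-32:]
    expected = hmac.new(VENDOR_KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)
```

This only makes tampering detectable, not impossible, which is exactly the bar the suggestion sets: the file should stop working as a "clean" video once the signal is disturbed.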
Another suggestion is to encode the sound in a virtual reality video differently. Doing so would make it harder to convert between file formats, and it would add another footprint.
The premise is that it will be hard for a generated video to pose as real: a tag will show, or the viewer will know from the file format.
Of course, nothing stops malicious coders from creating their own algorithms. The bet (and a well-informed one) is that collectively created algorithms and modules will advance faster than individual coders building these AI pornographic generators. Viewers will ultimately pick up on the quality of a generated video trying to pose as real. It will look and feel off.
Hacking this suggestion:
Here are some hacks that could make this suggestion obsolete:
- Create a converter between the file formats
- Play the video in its virtual format, capture the output as a regular video, and then edit the resulting file to remove anything that indicates it is generated.
These hacks become a moving target. Industry standards will have to evolve to make it harder, and less profitable, for bad actors to create fake videos that pose as real.
This suggestion is hardly a complete solution: it does not address the social dynamics that fuel the fake-news economy. But adding a technical barrier could be a place to start.
Also, it would be a neat idea to have a clear signal for what on the internet is real, what is virtual, and what is fake (virtual reality and fake videos should never become synonyms). Virtual reality coders and enthusiasts have every incentive not to have their code, work, or hobbies confused with, or used by, petty fake-video makers.