(Just for fun) DeepFake Output (Avatar/Photorealistic Footage)
This wasn't part of the plan. However, over the weekend I received an email stating that a DeepFake generator I had applied to beta test had gone live. DeepFake Web, although not currently in my slate of tech to use for my MRP, piqued my curiosity.
As per my Major Research Project, I was interested in figuring out the best way of integrating photorealistic footage with a 3D animated likeness. My very first idea (since abandoned) was to output DeepFakes, which was one of the major reasons I decided to purchase a custom-built PC to use in lieu of my late 2017 iMac. But I didn't think this idea all the way through. DeepFakes are time-consuming, tedious, and oftentimes expensive. If you don't have the graphics power, they will fail. If you don't have the memory, you'll have to purchase Cloud Usage. Not to mention compiling the training data set, which is typically composed of hundreds of hours of footage and accompanying images. But to be completely honest, I was willing to put in the work, time, energy, and money.
Realistically, the element I didn't think through was the fact that I wanted my driver video (i.e. input, base) to be an animated avatar and the target (superimposed face) to be real footage. However, creating a DeepFake this way wouldn't take care of facial animation and lipsync unless the driver footage already contained that animation! So, in this case, what is the point? The output would be a 3D animated body with a semi-photorealistic but stationary face, taking on the avatar's lack of expression and facial movement. I decided that if I had to impose facial animation and expression mapping on my avatar anyway, I might as well use that as my final output and completely forgo any integration with photorealistic footage.
However, upon receiving the email that DeepFake Web had gone live, my curiosity was too powerful to simply ignore. So, much like other steps I've chronicled in this blog, I decided to simply play around and see what happened, considering I had to use the photorealistic footage as the driver and my avatar as the target (the opposite of what I wanted).
Unlike other DeepFake software like DeepFaceLab, the training sets you need to provide the site are extremely simple: one video for the driver and one for the target, with instructions provided for what type of video you should be using (i.e. front-facing, cropping dimensions, etc.). Unfortunately, there is a limit of 150 seconds per output, so if you're hoping to make a longer video you'll need to do your own editing. Another caveat is that you are required to purchase Cloud Usage for the output to be generated. I purchased 600 Cloud Usage minutes for $20.00, thinking maybe I'd use the software more than once. The estimated Cloud Usage per output is 240-360 minutes.
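Since each output is capped at 150 seconds, a longer source video has to be split into chunks and the results re-joined in your own editor afterwards. As a rough sketch (the chunk planning is my own, not part of DeepFake Web), here's how the cut points could be computed:

```python
# DeepFake Web caps each output at 150 seconds, so a longer source
# video must be split into chunks and the outputs re-joined later.
# This hypothetical helper computes (start, end) times in seconds
# for each chunk you'd need to upload.

MAX_CLIP_SECONDS = 150

def plan_segments(total_seconds, max_len=MAX_CLIP_SECONDS):
    """Return (start, end) pairs covering a video of total_seconds."""
    segments = []
    start = 0
    while start < total_seconds:
        end = min(start + max_len, total_seconds)
        segments.append((start, end))
        start = end
    return segments

# A 6-minute (360 s) video would need three uploads:
print(plan_segments(360))  # [(0, 150), (150, 300), (300, 360)]
```

With each output estimated at 240-360 Cloud Usage minutes, three uploads like this would blow well past a single 600-minute purchase, which is worth knowing before committing to a longer piece.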
So, I went ahead and uploaded my photorealistic footage (as seen in previous posts) and a close-up video of my Jamie avatar doing basic animation (mouth open, mouth closed). I recorded this basic animation very quickly in iClone 7 using its LiveFace plugin, which I'll write about more extensively in my next post. To enhance my output, I also uploaded a set of training images showing my avatar emoting various expressions. You can upload up to 1500 images which, I have heard, is a standard training set number.
Training video (target)
Training set (images)
After I hit "create my video" I was off to the races (or the marathon, considering how long it took to output). I had to wait nearly 4 hours until the final DeepFake was created. Once again, in this case, my driver had to be the photorealistic footage and the target was the avatar. In simpler terms, this DeepFake was taking my avatar's face and placing it onto my photorealistic body. And was it worth $20.00 and a 4-hour wait? I'll let you decide for yourself...
A nightmare avatar/human hybrid. This, as I presumed, did not work. But perhaps it will serve as inspiration for my next experimental art film.
All in all, this was nothing more than an experiment. I keep finding new, interesting ways of further refining my methodology, which will be a great asset to my process sooner rather than later. I'm glad I'm toying around with some of these external facial animation/lipsync tools now rather than when I'm in the thick of my MRP development. I still have 412 minutes of Cloud Usage on my DeepFake Web account, so perhaps I'll try it again once I have my avatar animation finished. That way, I can use my lively avatar as the driver as I initially intended. Or I'll make a controversial video that will cause society to descend into absolute chaos because no one will be able to distinguish fact from fiction. If only I could get rid of the mandatory watermark "This is a FAKE video" the software automatically stamps on your output.... hmm....