Facial Animation in iClone - Text to Speech vs. FaceWare vs. LiveFace
Updated: Mar 25, 2021
Arguably one of the most difficult steps in the 3D animation pipeline (especially for a newcomer) is facial and lipsync animation. Unlike my previous outputs (using various AI learning tools), this work is tedious, time-consuming, and finicky. Additionally, facial animation can make or break the realism of a 3D avatar. Largely, this is where the "uncanny valley" phenomenon comes into play. Although the process that I'm documenting isn't overly concerned with venturing into the uncanny valley (I can only expect so much of myself as a self-taught animator), it is still something I am extremely cognizant of, and thus I needed to test out a few different software options in order to see which one best suited my workflow and achieved my desired output aesthetic.
As someone who is, so far, working exclusively in iClone 7, I decided to document my trials and errors with a few of its facial animation techniques (whether manual features or motion capture plug-ins available as part of the iClone pipeline). First and foremost, I want to mention that I harbored a few biases going into this assignment. Working within an extensive self-teaching model, I have watched numerous tutorials on all the features mentioned below and, by and large, the LiveFace plug-in (a mocap feature which uses Apple's TrueDepth camera) came most recommended. That was my bias going into these different techniques; however, I tried to remain neutral in my critiques of each feature. In order to accomplish this, I decided to take a different approach to my documentation methodology for this post - opting instead to live stream and record my reactions to each technique in real time, as opposed to merely reflecting on the process after the fact. I had already watched tutorials on how to set up and use each feature, but the videos below demonstrate a sort of "stream of consciousness" reaction to each facial animation feature. I will also include some writing for context, after-the-fact thoughts, and reflections on the affordances of each technique.
1) TEXT TO SPEECH (included with iClone 7, no additional cost).
The text to speech feature for facial lipsync/animation, though quite rudimentary, produces fairly consistent outputs. Of course, the "speech" voice leaves much to be desired, as does the lack of facial expressiveness. However, as I state in the video, there is always the opportunity to re-record dialogue after the fact and sync the animation to it afterward (it would probably take a lot of tedious work, but it is possible with iClone's animation timeline feature). To combat the lack of expressiveness in the avatar's face, you can also use the face puppet and/or face key features to animate. Again, this is tedious and not as simple as a motion capture feature, but it would give the animator a great deal of control over the final output. For my work in 3D animated documentary filmmaking (using reference footage as the basis for my animations), this pipeline for facial animation would be even MORE challenging, but as I note in the video, I do see the affordances of the text to speech feature for anyone working in narrative, virtual film production. For a FREE feature (included in your purchase of iClone 7), this is quite a valuable tool for anyone looking to do facial animation, especially when combined with manual facial puppeteering or manual face muscle keying.
Expression keying features
Lip editor for phonic lip positions
2) Faceware Mocap Plug-In (iClone 7 + webcam, $1,090.00 USD for Faceware app and $99.00 USD for Motion Live app required)
Faceware, unlike the aforementioned text to speech function, is a plug-in hosted by Reallusion that needs to be purchased separately from the main iClone software ($1,090.00 USD for Faceware and $99.00 USD for the required Motion Live app that links the mocap to your iClone project). Faceware is a facial motion capture program that tracks data from a webcam, GoPro or ProHD camera in order to animate 3D characters. An added benefit of Faceware is that you can record a dialogue track onto your animation timeline while simultaneously capturing facial data. I will say that although I found the Faceware feature to be largely underwhelming, I've seen a few video tutorials in which the animator seemingly had more success with this particular app than I did. Perhaps this is because the majority of Reallusion's tutorials for Faceware have the animator testing out facial mocap on caricatures, rather than on avatars with a semi-realistic aesthetic. With caricatures, there is more room to play with overexaggerated lipsync and facial animation, but unfortunately, that wasn't my desired outcome. I also found that the trackers (rudimentary - eyes, brows, nose and lips) had a difficult time following my face, so if I moved too much, tilted my head, etc., the trackers would disappear instantaneously. As I mentioned in the above video, there is always room to clean up the mocap recording after the fact using iClone's face puppet and face key features (i.e. manipulating mouth movement, expressions, and facial muscles), but I was unable to test this because I was only using the trial version of Faceware (free for iClone users for 30 days, no recording privileges, only preview access). I do appreciate the fact that iClone lets users trial its more expensive apps and, in this particular case, the trial left me fully resolved not to purchase this feature for my animation pipeline. Especially not with the feature's current price tag.
Self-portrait: being dubious of Faceware's simplicity + lack of tracking points
3) LiveFace Profile (iClone 7 + any iPhone/iPad with a TrueDepth camera for facial recognition, $299.00 USD for LiveFace plug-in + $99.00 USD for Motion Live plug-in)
Overwhelmingly, the LiveFace plug-in is the most recommended app for iClone facial mocap. I dug through numerous forums, watched countless YouTube reviews, and even posed my own questions in iClone Facebook groups; the general consensus was that LiveFace is the far superior mocap feature (as opposed to Faceware). Now, that being said, I would argue that although it was a big improvement, this application is not as accessible. In order to use the LiveFace feature, you need an iPhone X or later OR any iPad with a TrueDepth camera: iPad Pro 11-in. (1st and 2nd generation) and iPad Pro 12.9-in. (3rd and 4th generation). Although the iPhone X is now 3 years old (ancient by some tech standards), it is still quite expensive, or, more expensive than this plug-in at least. If you are like me, you hold on to your old iPhone until all the life is drained from it. Currently, I own an iPhone 8, and I only purchased this phone a little over a year ago. My phone works great - I don't see why I would want to upgrade it because I just don't care that much! However, that became a predicament when I learned that, by and large, LiveFace was the superior motion capture plug-in. Full disclosure - I still don't own an iPhone X or above, but Ryerson University was kind enough to purchase an iPhone 11 so that I could use it for this project, and so that the school could then lend the phone out to future students who are interested in using its TrueDepth features. So, a big shout out to Ryerson University.
Unlike Faceware, LiveFace maps a full mesh of your face for superior tracking. This is done through the external LiveFace app (Apple). LiveFace connects over your WiFi network, using an IP address to sync the tracking happening in the iPhone app with your iClone/avatar project. I took a short video (recorded on my iPhone 8) to demonstrate exactly what the app interface/facial tracking looks like:
As you can see, the facial mocap data being tracked by LiveFace is far smoother, cleaner and less distorted compared to Faceware. Of course, some of this animation will likely have to be cleaned up/refined after the fact, but as a base animation, the LiveFace feature was impressive enough for me to decide to buy the full version (total $399.00 USD). This way, I can actually record my facial animation and then edit it as needed in iClone's timeline (recording is not available in the trial version). Moving forward with this section of my assignment, I will only be using a mix of the LiveFace and text to speech features to recreate my reference footage as 3D animation.
Next up in my process - movement animation (manual animation attempt vs. Rokoko Mocap attempt).