Kimmie777
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 569 | Visits: 1.1K
Hey Guys!
Looking ahead to our next video clip, we need to be looking at all of our lip-sync program options.
What all do you guys recommend?
I'm sure iClone is fine for a single sentence.
But when we have 20+ second audio files of someone speaking, stopping the timeline is like stopping a freight train.
It never stops at the place where I press stop; it keeps going for a bit before it finally stops.
Then I have to go back and find the exact place I wanted it to stop at.
It also has no option to temporarily slow down audio playback so you can watch everything step by step.
** I am not going to temporarily stretch out the audio in order to slow it down.
That is too risky and far too high-maintenance for the number of audio clips that we will be working with.
If we don't get it back into place right on the dot, even a 3% difference in audio speed will sound 'off'.
Not going there on this production. **
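(To put rough numbers on why: a quick back-of-envelope sketch in plain Python -- the 20-second clip length and the 3% error are just the figures from above.)

import math

speed_factor = 1.03   # a 3% residual speed error after un-stretching the audio
clip_seconds = 20.0   # clip length from the example above

# Cumulative lip-sync drift by the end of the clip:
drift = clip_seconds - clip_seconds / speed_factor
print(f"drift at end of clip: {drift:.2f} s")    # ~0.58 s -- plainly visible

# Pitch shift, if the speed change is applied without pitch correction:
semitones = 12 * math.log2(speed_factor)
print(f"pitch shift: {semitones:.2f} semitones") # ~0.51 -- audibly 'off'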
My first lip-syncing project, using only iClone's built-in lip-sync tools, took about half a day.
That cannot happen again.
We need a lip-syncing program that meets the needs I mentioned above.
Suggestions, anyone?
Thanks! :-)
~ Kimmie
www.TrackingJESUS.net
--> More Information On This Bible Film Production Can Be Found At: www.TrackingJESUS.net/Productions
--> Become An Early Subscriber To The 'Tracking JESUS' YouTube Channel At: https://www.youtube.com/channel/UCEMx91AySc3QuF-SS_3mGQQ
animagic
Posted 6 Years Ago
Group: Forum Members | Last Active: 3 Days Ago | Posts: 15.8K | Visits: 31.4K
Are you aware that you can scrub the timeline and hear the audio? That's what I use when editing visemes.

The procedure I use is to record the audio outside of iClone (I use Audacity to record) and then have an audio file with the speech. I always do the timing beforehand so that the audio includes any pauses. So if I have several people speaking, I give each a separate track, with pauses included where someone else is speaking. I export from Audacity as one audio file for each speaker, which is then loaded into iClone for the lip-syncing.

Now, we all agree that the viseme assignment within iClone is not ideal and needs editing, and that is labor-intensive. There is a Python script that Mike Kelley wrote that can use a list of visemes generated in a program called Papagayo. Papagayo works better because it uses the actual dialog text to improve the viseme assignment. Something similar could be built into iClone (and is already there when you use TTS voices), but thus far this idea has fallen on deaf ears (pun intended) on RL's part.

Back to Papagayo. Mike has written a script that imports the Papagayo visemes into iClone, but the drawback is that you then no longer have the speech file available with the visible waveform. I liked Mike's idea and enhanced it so that the visemes are inserted while keeping the speech waveform. Unfortunately, the latest Python API has a bug and that script no longer works...:crying:

The other alternative is to invest in facial mocap, which is fairly expensive and may not be practical if you use external voice talent.
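(For the curious, here is a minimal sketch of what the Papagayo route does under the hood -- not Mike's or my actual script. Papagayo exports a MOHO switch file: a "MohoSwitch1" header followed by "frame phoneme" lines. The phoneme-to-viseme mapping below is an assumption about iClone's viseme names, and the final insertion step is only a print; the real scripts add the keys through the iClone Python API.)

# Preston Blair phonemes (Papagayo's default set) -> assumed iClone visemes
VISEME_MAP = {
    "AI": "Ah", "O": "Oh", "E": "EE", "U": "W_OO",
    "L": "T_L_D_N", "WQ": "W_OO", "MBP": "B_M_P", "FV": "F_V",
    "etc": "Ih", "rest": "None",
}

def parse_moho_dat(path):
    """Yield (frame, phoneme) pairs from a Papagayo MOHO export."""
    with open(path) as f:
        if f.readline().strip() != "MohoSwitch1":
            raise ValueError("not a MOHO switch file")
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue                      # skip blank or malformed lines
            yield int(parts[0]), parts[1]

def to_viseme_track(path, fps=24):            # 24 fps is Papagayo's default
    """Convert Papagayo frames to (time in ms, viseme name) pairs."""
    for frame, phoneme in parse_moho_dat(path):
        yield round(frame * 1000 / fps), VISEME_MAP.get(phoneme, "None")

if __name__ == "__main__":
    for t, v in to_viseme_track("dialog.dat"):  # hypothetical file name
        print(f"{t:>7} ms  {v}")  # a real script would insert viseme keys here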
Kimmie777
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 569 | Visits: 1.1K
We do the same thing. We also use Audacity (LOVE that program!) and record the voices in there.
It sounds like we are both doing the exact same thing.
Yes, Papagayo... Mike Kelley told me about that recently and he sent me a link to a tutorial he did on it. :-)
(Mike is awesome! :-) - And he is quite the Jedi Master at fixing things where iClone falls very short.)
Papagayo would solve part of the issue for sure. I am really sad to hear that the script no longer works. I hope it gets fixed.
I am also hoping to be able to work from a timeline that does not have the same stopping distance as a freight train. :-P
...And the ability to do a play-by-play playback would be really nice too. Even half-speed playback would be an asset.
And yes, Animagic, I agree... The lip-sync is labor intensive indeed.
And it doesn't have to be. Just a few simple changes to that program could make it really effective.
Is there another lip-sync program out there that can work as a stand-alone and then be able to port the finished lip-sync files into iClone by chance?
Reallusion looks like it is good for simple stuff.
But after all that we have invested into this, it is looking like iClone just can't handle a project like this except with a lot of unnecessary labor on our end.
Can lip-sync files that were done in another program even be imported into iClone? (File-type issues would be one obstacle there, of course.)
Or is this as effective as it gets here?
~ Kimmie
www.TrackingJESUS.net
--> More Information On This Bible Film Production Can Be Found At: www.TrackingJESUS.net/Productions
--> Become An Early Subscriber To The 'Tracking JESUS' YouTube Channel At: https://www.youtube.com/channel/UCEMx91AySc3QuF-SS_3mGQQ
Kelleytoons
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 9.2K | Visits: 22.1K
Kimmie,

My script still works. Job made some changes that made it a bit easier (with mine you will need to do a bit more work up front, but my newest version even fixes that, and I used a method that still works, unlike what Job did). It is by far the best solution for now, until you go to mocap.

You can use mocap for pre-recorded audio -- it's what I did for the sample I sent to you. What I do in those cases is just mocap over the original actor's voice, as if I were doing lip syncing in a music contest. I also slow down the audio by 50%, as that helps greatly. It's not hard and VERY quick (I also did it for the old folks video that was excerpted from our animated weekly series for Fox, if you want to see the results).

But it will cost $$$, mostly if you don't already have an iPhone X/R (the cheapest one you can get will cost around $500). If you have one already (or know someone who has one you can borrow), then all you need is the $400 or so mocap software from Reallusion. I would *highly* recommend going this route as soon as you can afford it.

Without mocap it typically took me around a week to sync up our 22-minute show (a dozen or so characters, mostly dialog-driven). And I'm pretty good at Papagayo, and we're talking a 40-hour week (not just "in my spare time"). So if you could only do 4 hours a day it would take 10 days; at 2 hours a day, three weeks, etc. I honestly do not believe there is any way you can do better using any other manual tool.

Using mocap you can do the same 22-minute sync in a few hours (with perhaps an hour or two of cleanup if you are *really* picky -- look at the mocap I sent you to see what I did in just about a minute with John). Again, I don't think you can do better than that with any other mocap system.
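(For what it's worth, the schedule arithmetic above is just the 40-hour week divided by your daily hours -- a trivial sketch, with the 40-hour figure from my own experience:)

total_hours = 40                      # one full-time week of manual syncing
for hours_per_day in (8, 4, 2):
    days = total_hours / hours_per_day
    print(f"{hours_per_day} h/day -> {days:.0f} working days")
# 8 h/day -> 5 days, 4 h/day -> 10 days, 2 h/day -> 20 days (~3 weeks if you work weekends)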
Alienware Aurora R16, Win 11, i9-14900KF 3.20GHz CPU, 64GB RAM, RTX 4090 (24GB), Samsung 870 Pro 8TB, Gen3 NVMe M.2 SSD 4TB x2, 39" Alienware Widescreen Monitor
Mike "ex-genius" Kelley
AutoDidact
Posted 6 Years Ago
Group: Forum Members | Last Active: 2 Months Ago | Posts: 2.1K | Visits: 13.6K
@Kimmie, Also (with notable exceptions like Papagayo), AFAIK there are no other external audio-based lip-sync solutions that will by default produce phonemes for iClone native avatars.
Most of the audio-based ones seem to be rather program-specific; even game engines have their own now. Kelleytoons has quite definitively laid out the reality of today's state of affairs (for iClone users at least) on this aspect of filmmaking.
Reallusion has actually addressed the frankly abysmal performance of the default lip-sync with their $$ facial mocap $$ option, which is quite good IMHO, so no new audio-based options will likely be offered by RL going forward. Glad to hear Papagayo still works here, though.
Kelleytoons
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 9.2K | Visits: 22.1K
My Papagayo script now allows you to pick the audio and data files, so it's a lot easier to use (no editing of the script is necessary). Here's a link: https://www.dropbox.com/s/g6kv4qynxo2bcv8/Pagayo2.py?dl=0
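(The actual script is at the link; for anyone curious what the file-picking step looks like, here is a generic sketch of that step in plain Python/tkinter so it runs anywhere -- inside iClone the real script may do it differently, e.g. via the embedded Qt dialogs.)

import tkinter as tk
from tkinter import filedialog

root = tk.Tk()
root.withdraw()   # hide the empty main window; we only want the dialogs

audio_path = filedialog.askopenfilename(
    title="Select speech audio",
    filetypes=[("WAV audio", "*.wav"), ("All files", "*.*")],
)
dat_path = filedialog.askopenfilename(
    title="Select Papagayo .dat export",
    filetypes=[("MOHO switch data", "*.dat"), ("All files", "*.*")],
)
print("audio:", audio_path)
print("data: ", dat_path)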
Alienware Aurora R16, Win 11, i9-14900KF 3.20GHz CPU, 64GB RAM, RTX 4090 (24GB), Samsung 870 Pro 8TB, Gen3 NVMe M.2 SSD 4TB x2, 39" Alienware Widescreen Monitor
Mike "ex-genius" Kelley
Kimmie777
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 569 | Visits: 1.1K
KelleyToons, I would LOVE to see your show!! :-)
And now that I am not cramming my brain with learning and applying all of the different technical stages of our project, this would be a perfect time to see your shows, please! :-)
As a matter of fact, I would love to see all of your guys' projects now that we have a little bit of down-time. :-) (Well, anything that isn't icky, etc.)
Animagic: I forgot to mention that all of the audio is being recorded externally. So yeah, facial mocap by itself would not be practical.
As a matter of fact, the audio clips are being recorded 'from sea to shining sea' on this production (with matching mic sets and using Audacity to clean up any white noise on them).
AutoDidact: For me personally, it is only partly about the phonemes... The other part is the lack of ability to slow things down or have them stop right on command.
Even if the phonemes were completely accurate, there is still editing to be done in expressiveness levels and expression types - and also in slight head turns and such.
But yes, making the phonemes easier to find is definitely a problem-solver since iClone's timeline is not 'stop-right-here-now-that-I-found-it' friendly.
KelleyToons' Papagayo script will be quite valuable for that.
I suppose with the mocap system and some practice with it, the rest of the issues could be handled in there.
Even doing a few 'takes' looks like it would be more time-efficient than what we have now.
KelleyToons: No iPhones here, but as I just mentioned to Animagic in this same reply box, none of our audio clips are being recorded at the animation computer so far.
Even on our local recordings we make house calls with the mic set and a laptop containing Audacity.
So I guess the only solution on these things is mocap whenever we are able to do that - mocap combined with Papagayo, that is. :-)
Excellent feedback from all of you! - Thank you guys so much! :-)
Peace, ~ Kimmie
www.TrackingJESUS.net
--> More Information On This Bible Film Production Can Be Found At: www.TrackingJESUS.net/Productions
--> Become An Early Subscriber To The 'Tracking JESUS' YouTube Channel At: https://www.youtube.com/channel/UCEMx91AySc3QuF-SS_3mGQQ
Kelleytoons
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 9.2K | Visits: 22.1K
Kimmie, What I was saying about mocap is that you CAN use pre-recorded audio to do mocap with. You just "lip sync" to the pre-recorded audio. I think what I'll do is create a tutorial on this process because I think others would like to see it. If I think of it I'll try and link to it here after I'm done (maybe later today but for sure tomorrow).
Alienware Aurora R16, Win 11, i9-14900KF 3.20GHz CPU, 64GB RAM, RTX 4090 (24GB), Samsung 870 Pro 8TB, Gen3 NVMe M.2 SSD 4TB x2, 39" Alienware Widescreen Monitor
Mike "ex-genius" Kelley
Kimmie777
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 569 | Visits: 1.1K
Yes, I got that. :-) I may not have worded it very clearly.
From all of the combined feedback here, it sounds like it takes the whole mocap system to be able to do that; with facial mocap alone, it would need to be done internally.
Tell me if I got that right or not.
Thanks, and looking very much forward to your tutorial on that... I am sure it will help lots of people. :-)
~ Kimmie
www.TrackingJESUS.net
--> More Information On This Bible Film Production Can Be Found At: www.TrackingJESUS.net/Productions
--> Become An Early Subscriber To The 'Tracking JESUS' YouTube Channel At: https://www.youtube.com/channel/UCEMx91AySc3QuF-SS_3mGQQ
Kelleytoons
Posted 6 Years Ago
Group: Forum Members | Last Active: Last Year | Posts: 9.2K | Visits: 22.1K
Now you've got me a bit confused. If you mean there are distinct differences between the viseme approach and using facial mocap, though, you are correct. While it's true you can turn on visemes for additional capture while you do facial mocap, all that really happens is it affects the tongue. But in either case you can edit the muscles to get the look you want.

Let's see if I can explain that properly -- you use your pre-recorded audio to capture lip sync AND emotions, but without visemes (which you don't really need). Then later you can tweak those things by using the emotion presets and/or individual muscle adjustments -- the same way you'd add expressiveness with visemes only.

Or, the proof is in the pudding -- here's my short where *I* did all the lip sync using the pre-recorded audio from our actors (IOW, all I had was the audio files, as some of the actors were not even with us anymore):
Alienware Aurora R16, Win 11, i9-14900KF 3.20GHz CPU, 64GB RAM, RTX 4090 (24GB), Samsung 870 Pro 8TB, Gen3 NVMe M.2 SSD 4TB x2, 39" Alienware Widescreen Monitor
Mike "ex-genius" Kelley