
Lip Sync Improvement...Any Tips or Suggestions

Posted By TopOneTone 6 Years Ago
TopOneTone
Posted 6 Years Ago
Distinguished Member (3.6K reputation)

Group: Forum Members
Last Active: Last Year
Posts: 329, Visits: 3.2K
https://www.youtube.com/watch?v=CFGGJoJjvyk

This is my latest addition to the Chuck Chunder series. I'm really enjoying working on these, and it has been a good way to force myself to learn iClone features and plug-ins that I previously wasn't using very often.
The big issue I'm very conscious of at the moment is the lip sync. You can see from this video that, despite putting a lot of time into lining up the phonemes and adjusting mouth openings, it's still far from perfect. Recently I had a project for a client that involved two characters in a 3-minute discussion; it took me nearly a week to correct the lip sync, and even then I felt it wasn't perfect. I'll confess lip sync is not something I have paid a lot of attention to in the past, but I'm spending a lot of time on it now and am very conscious of its impact.

I have a couple of theories that I'd like to throw out there for comment and hopefully stimulate some discussion on techniques for improving lip sync and maybe Kai or someone could make it the focus of an advanced tutorial.
1) I'm finding the sound is marginally ahead of the animation, i.e. there seems to be a very slight delay between the voice and the lip movement. In my non-techie way I can rationalise that this will be the case in an automated system, but I regularly have to adjust for it.
2) This delay seems to vary across the sound track, so sliding the sound track along the timeline to correct the first part of the speech may create a problem further on. I often end up splicing the sound file into smaller portions just to deal with this.
3) I often have to make changes to the phonemes. I guess this could be influenced by the tone, volume and speed of the voice, and these could also affect the two points above.
4) The weirdest thing I regularly encounter is that I am convinced the render process has an impact on the lip sync outcome. There are times when I can see a clear difference between the lip sync I observe on screen in iClone and the rendered video. This becomes apparent in the process I use for creating the Chuck Chunder videos. I have a pre-recorded sound track, which I splice up to create the individual character voice tracks for the lip sync. I then take the rendered scenes and compile them together, replacing the individual voice tracks with the original complete mix sound track for the final edit, only to discover that various sections don't line up perfectly.
I'm not bagging iClone's lip sync; I just want to work out how to get a better end result than I'm currently getting. So if anyone can shed any light on what I'm encountering or offer tips or techniques to improve on my approach, I'd really appreciate it. Does anyone know if the lip sync system is likely to be developed in the future?
Cheers,
Tony 
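
Points 1 and 2 above come down to simple arithmetic, which can be sketched roughly as follows. This is only an illustration: the function names, the one-frame tolerance, and the example offsets are all hypothetical, and the offsets would have to be measured by hand at a few points along the track. The idea is to convert each measured audio-versus-lips offset into frames and check whether a single slide of the track could fix them all.

```python
from typing import Sequence


def offset_in_frames(offset_ms: float, fps: float) -> float:
    """Convert an audio-vs-lips offset in milliseconds to frames."""
    return offset_ms * fps / 1000.0


def needs_split(offsets_ms: Sequence[float], fps: float,
                tolerance_frames: float = 1.0) -> bool:
    """Return True if the measured offsets differ from each other by more
    than the tolerance, meaning one global slide cannot correct them all."""
    frames = [offset_in_frames(ms, fps) for ms in offsets_ms]
    return max(frames) - min(frames) > tolerance_frames


# Hypothetical offsets measured near the start, middle and end of a track.
measured_ms = [40.0, 45.0, 90.0]
print(needs_split(measured_ms, fps=30))
```

If the spread is larger than about a frame, splicing the track into sections (as described in point 2) is probably the only fix; if it is smaller, sliding the whole track by the average offset may be enough.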


        

 
Delerna
Posted 6 Years Ago
Distinguished Member (8.2K reputation)

Group: Forum Members
Last Active: 2 Years Ago
Posts: 1.5K, Visits: 14.8K
I really don't know the answer, but I wonder if something I have heard about lip syncing to singing might help? It might make it harder too, though.

Putting a singing voice on the character so he/she lip-syncs to the song doesn't work very well. I have heard others say the same thing and recommend using the singing as a background sound and recording yourself speaking the song rather than singing it, to control the lip syncing better. I haven't tried it myself yet, so I can't say it works, but it sounds reasonable. I guess it might even help you time the lip syncing to the actual voice. I figure you can speak the words in a way that helps the lip syncing, even though it might not sound good. The actual voice track plays in the background, so it sounds good but doesn't affect the lip syncing. Whether that will be too hard to do or not, I don't know. It's just what I have heard from some other people.


i7-3770 3.4GHz CPU 16 GB Ram   
GeForce GTX1080 TI 11GB
Windows 10 Pro 64bit
Edited 6 Years Ago by Delerna
Delerna
Posted 6 Years Ago
Mind you, I just finished watching it and I really enjoyed it.
Yes, I saw the lip sync issue you are talking about, but really I only took notice of it because you mentioned it.

If you hadn't said anything I wouldn't have taken much notice of it and would still have enjoyed watching it. Maybe it's just me, but I love cartoon styles and don't expect to see perfection.
I'm not trying to say you shouldn't try to improve it. Just that I totally enjoyed it and the lip syncing didn't affect my enjoyment of it at all.


Kelleytoons
Posted 6 Years Ago
Distinguished Member (35.6K reputation)

Group: Forum Members
Last Active: Yesterday
Posts: 9.1K, Visits: 21.8K
This might be a *touch* OT, but for really perfect lip sync there isn't any better program I've ever found (and I've used things costing $$$ -- even twice the cost of Faceware, which you'd think would be perfect for lip sync) than the freeware Papagayo.  Of course, there's a catch -- it isn't designed for iClone.

What it does is analyze an audio track and generate the phonemes you need, and then it allows you to easily adjust them (even a 10-minute scene takes only a minute or so) so they are perfect.  It's designed for the Moho 2D animation program and exports an ASCII file that Moho loads automatically.   I've used it for broadcast television work and never had any complaints.  Ah, but then how do you apply this ASCII file to iClone?

I was SO hoping that when the Python (add-on?  feature?) came out we'd be able to program this, but since it appears like they are basically crippling any abilities it will have that clearly won't be possible (because if it was I'd have it done within a day or two -- it would be my highest priority).   Failing that you could use it to manually adjust things (a PITA, but it *would* be accurate and it's basically what we had to do for broadcast work back in the day with very expensive track generators).  At the very least everyone ought to download Papagayo just in case someday it goes away.

It's my fondest hope I live long enough for this to be automatically possible in iClone (I have SO many audio tracks recorded by professionals I need to animate -- would be amazing).
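
For anyone who wants to do the manual route, the exported file can at least be turned into a list of timings to work from. The sketch below assumes the export is the plain-text Moho switch format, i.e. a `MohoSwitch1` header line followed by `<frame> <phoneme>` lines; that layout, the function name `parse_moho_switch`, the `first_frame` parameter, and the sample data are all assumptions for illustration. It does not touch iClone at all; it only converts frame numbers to seconds so the phonemes can be placed by hand.

```python
from typing import List, Tuple


def parse_moho_switch(text: str, fps: float,
                      first_frame: int = 0) -> List[Tuple[float, str]]:
    """Parse '<frame> <phoneme>' lines into (seconds, phoneme) pairs.

    first_frame is the frame number treated as time zero; set it to match
    how the exporting tool numbers its frames (an assumption to verify).
    """
    cues = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, blank lines, and anything not shaped like
        # "<integer> <name>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        frame, phoneme = int(parts[0]), parts[1]
        cues.append(((frame - first_frame) / fps, phoneme))
    return cues


# Hypothetical sample data, not taken from a real export.
sample = "MohoSwitch1\n1 rest\n7 MBP\n10 AI\n"
for seconds, phoneme in parse_moho_switch(sample, fps=24, first_frame=1):
    print(f"{seconds:.3f}s  {phoneme}")
```

Working in seconds rather than frames keeps the timings valid even if the iClone project runs at a different frame rate from the one used in the export.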



Alienware Aurora R12, Win 10, i9-119000KF, 3.5GHz CPU, 128GB RAM, RTX 3090 (24GB), Samsung 960 Pro 4TB M-2 SSD, TB+ Disk space
Mike "ex-genius" Kelley
TopOneTone
Posted 6 Years Ago
Thanks, mate. Interesting suggestion; I'll have to try that out and see if it works.
Glad it didn't get in the way of your enjoyment.
Cheers,
Tony

https://forum.reallusion.com/uploads/images/84bd6a7b-1440-4f33-849f-d5ce.jpg
Plenty more episodes of Chuck Chunder at:
YouTube: https://www.youtube.com/channel/UCGRNWyCMD7iPr4DqUNnHwgQ
Facebook: https://www.facebook.com/Chundertime/
Delerna
Posted 6 Years Ago
I would like to know if it does. At the moment I am creating a lot of props for my next Harry Potter video. Soon I will start making the video itself, and I am sure I will need to work on my lip syncing too. Knowing how that works will be helpful.

Zeronimo
Posted 6 Years Ago
Distinguished Member (3.8K reputation)

Group: Forum Members
Last Active: 8 hours ago
Posts: 478, Visits: 19.1K
I do not know what you think about it, but I think there is also a technical problem: iClone works at 60 frames per second and so generates the lip movements at that rate, but when we render at 30 fps we lose 50% of the frames.
This is especially visible with plosive ("explosive") syllables such as 'pa' and 'pe', which involve very fast lip movements.
Perhaps it should be rendered at 60 fps without compression when the movie contains dialogues.
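
The concern above can be illustrated with a toy example. It assumes the 30 fps render simply keeps every second frame of the 60 fps timeline, which may not be how iClone actually renders, and the mouth states are made-up data. A lip closure that lasts a single 60 fps frame either survives or disappears depending on which frame it falls on.

```python
def decimate(frames, step=2):
    """Keep every `step`-th frame, starting from the first one."""
    return frames[::step]


# Hypothetical 60 fps mouth states; the closure for a 'p' lasts one frame.
closure_on_odd_frame = ["open", "open", "open", "closed", "open", "open"]
closure_on_even_frame = ["open", "open", "closed", "open", "open", "open"]

print(decimate(closure_on_odd_frame))   # the closure is dropped
print(decimate(closure_on_even_frame))  # the closure survives
```

In this simplified model the result depends on where the closure happens to land, which would fit with sync looking fine in some places and wrong in others after rendering.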
Kelleytoons
Posted 6 Years Ago
Actually, fewer frames per second work better with fast dialog -- so the more explosive the sounds, the fewer frames you need (most animation is done at 24 fps and looks terrific).

Having more frames just exposes the "wrongness" of the sync.  Try rendering at 24fps and you might get a better match.



Zeronimo
Posted 6 Years Ago
Kelleytoons (7/10/2018)
Actually less frames per second works better with fast dialog -- so the more explosive the sounds, the fewer the frames you need (most animation is done at 24 fps and looks terrific).

Having more frames just exposes the "wrongness" of the sync.  Try rendering at 24fps and you might get a better match.



I would have thought that more frames per second would give a better result, but if you say otherwise I assume you have tried it, so you must be right.
I'll have to try it next time.
Kelleytoons
Posted 6 Years Ago
It may work better for slow speech (or for *very* precise, slow movements of the face) but yes, far fewer frames work better with things like explosives.

Think of it this way -- if you want to see a "flash" (a "real" explosive ;) ) you really only want a frame or two at the most.  More frames, even if in Real Life it might be that way, don't convey the speed.  While things have changed quite a bit (it used to be that all films were done at 24fps, but with digital we are seeing more and more done at higher frame rates) it's still important for very quick things to be more of an impression than spelled out.  So things like "p's" and "f's" will "read" better with just the smallest number of frames -- the eye is fooled (remember, we are just fooling the eye anyway, due to persistence of vision).

And even Faceware reads better with fewer frames -- the guy that does all the tutorials here only does them at 30fps (he doesn't even use a webcam capable of higher rates).  I've seen this firsthand myself and have now stepped back to that from the 60fps I was using.
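
To put rough numbers on the "frame or two" idea, here is a small sketch. The 40 ms duration is a made-up figure for a quick lip closure, not a measured one, and the function name is hypothetical. It counts how many whole frames such a sound would occupy at the frame rates mentioned in this thread.

```python
def frames_for_sound(duration_ms: float, fps: float) -> int:
    """Whole frames a sound of the given length occupies (at least one)."""
    return max(1, round(duration_ms * fps / 1000.0))


# Hypothetical 40 ms lip closure at the frame rates discussed here.
for fps in (24, 30, 60):
    print(f"{fps} fps: {frames_for_sound(40, fps)} frame(s)")
```

At 24 or 30 fps the closure is a single frame, which is the impression described above; at 60 fps it becomes two frames, so the mouth shape has to be right for longer.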





