Synchronizing speech


https://forum.reallusion.com/Topic235220.aspx

By TimV - 10 Years Ago
Hi

As a learning project, I am at the beginning of creating a music video. The castle I posted about previously will be involved in this once I get it built in Hexagon. At any rate, here is what I have done.

There is a song that I really like by Libby McGrath called Princess. I give fair warning that it might stick in your head for a while if you search it out, but I love the concept and the lyrics.

I have a text file of the words which I have run through a text to speech program (Balabolka). I have that file on the sound line of my project in animation, which nicely drives the facial movements so that my avatar (Heidi) will act as my singer.

I have dropped the mp3 of the song into the main window and it is working fine as the background music. So, I have the audio of the text file and the mp3 both playing at the same time.

The next step is to synchronize the spoken audio with the voice of the mp3. The timing is worlds apart right now, so I am wondering if there is a better approach than just trying to speed up the spoken word delivery.

Is there a way to perhaps use the text of the song from the full mp3 (music and voice) to drive the lip-syncing instead of a separate text file?

High level suggestions would be great. I can figure out the process with a little prodding on the overall direction.

Thanks,

Tim
By prabhatM - 10 Years Ago
If you think the Text to Speech has given you good lip syncing, then render that to Image Sequence / video without audio.

In a video editing programme, mix with the original song.
By Cricky - 10 Years Ago
TimV (5/3/2015)



Consider this:


By Rampa - 10 Years Ago
Might I suggest:

Getting the timing right may prove to be very difficult. If you have access to a microphone, I would suggest you record it as a "sing-along". It doesn't need to sound good or anything; it's just to get the timing right. Then drop out the voice as you're doing now.

By prabhatM - 10 Years Ago
rampa (5/3/2015)



This is definitely a better method.

I never needed it because I always get an independent voice track for all my original recording.
By TimV - 10 Years Ago
Thank you all very much. I really appreciate the input!!!!

I have downloaded Audacity and added a track for my spoken-word text file. Once I figure out how to splice in spaces, I should be able to line the words up with the mp3. I also think I saw something that would let me adjust the speed of the audio in the text file, which would let me match the length of time it takes to speak each word with its actual length in the mp3.
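
(Side note for anyone who would rather script that kind of adjustment than do it by hand: I gather the free pydub Python library can insert silence and speed audio up. A rough, untested sketch - the file names, splice point, gap length and speed factor are just made up for illustration:)

    # Rough, untested sketch using the pydub library (needs ffmpeg for mp3 input).
    # The file names, splice point, gap length and speed factor are made up.
    from pydub import AudioSegment
    from pydub.effects import speedup

    spoken = AudioSegment.from_file("lyrics_tts.wav")
    gap = AudioSegment.silent(duration=750)            # 750 ms of silence
    shifted = spoken[:12000] + gap + spoken[12000:]    # splice the gap in at the 12 s mark
    faster = speedup(shifted, playback_speed=1.1)      # deliver the words about 10% faster
    faster.export("lyrics_tts_adjusted.wav", format="wav")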

I have a licensed version of 3DXChange 5 Pro, if that is of any help.

I don't have a mic at the moment, but it is a good suggestion to set the timing with a personal sing-along recording of the song. I think I could get that pretty close with a little practice.

I'm going to see if I can find any free software that will let me adjust the timing in the voice text file. I don't have a budget for "expensive" software, but I will look into Lequendo to see what I should be searching for.

best regards,

Tim
By TimV - 10 Years Ago
Hi All:

My search for something similar to Lequendo came back with Balabolka as the first recommendation. I downloaded Balabolka to create the speech file that I used to drive the facial animation.

I have never used it before today, but it may allow me to do the timing adjustments I need in the text file prior to bringing the file into IClone6. That just might do the trick for me.

Time to learn a new program.....Never enough time, but I never waste an opportunity to learn something new. It always pays off in the end.

Thanks for your help in heading me down this path with your suggestions.

Best,

Tim
By Rampa - 10 Years Ago
sw00000p (5/3/2015)
"Does iClone 6 import facial animation?"
...rampa, ...or does anybody else know? :ermm:

If so, Tim would just import the animation and Drop the Sound Track!:)
This is what I do using (3dxchange 4.0) :Wow:






Kind Regards,
sw00000p :cool:






You can do what you used to do in 3DX4, that is import bone based animation that happens to be facial bones, but not characterize it. It's the getting of the animation that may prove difficult. You could try "frankensteining" a mocap head onto an iClone body.

Do you have a way to record bone based facial animation, sw00000p?

So.... Sort of.
By prabhatM - 10 Years Ago
rampa (5/3/2015)


Sw00000p has been claiming too often bringing the facial animation to ICLONE through 3DX4. I and the newbies would love to see a sample of that.
By Rampa - 10 Years Ago
prabhatM (5/3/2015)


Until you characterize it and define certain bones as "facial", you're playing prop animation that happens to be in the shape of a human face. So yeah, it's not hard to do. The difficulty comes in making it an iClone avatar, and in capturing it in the first place if you don't have the software.
By Rampa - 10 Years Ago
Can you post an FBX of some random bit of a face speaking? I find plenty of BVH examples, but an already functioning face would be fun to experiment with.
By TimV - 10 Years Ago
Hi All:

I may be off down a rabbit hole, but I will let you know. Balabolka will not do what I need it to do. I did find an SYLT editor (called SYLT Editor) which is intended to synchronize lyrics or text to MP3s. According to the documentation, you can break the text file down to individual syllables if you want to go that far. I have attached the manual as a reference. I think this is going to work.
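
(For anyone who would rather script the tagging than use the editor: I believe the free Python library mutagen can write a SYLT frame directly into an MP3. A rough, untested sketch - the words and millisecond timings below are just placeholders:)

    # Rough, untested sketch using the mutagen library; words and timings are placeholders.
    from mutagen.id3 import ID3, SYLT, Encoding

    tags = ID3("song.mp3")                    # assumes the mp3 already carries an ID3 tag
    timed = [("La", 12000), ("la", 12400), ("la la", 12800)]   # (text, time in milliseconds)
    tags.setall("SYLT", [SYLT(encoding=Encoding.UTF8, lang="eng",
                              format=2,       # 2 = timestamps are absolute milliseconds
                              type=1,         # 1 = the frame holds lyrics
                              text=timed)])
    tags.save()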

regards,

Tim
By prabhatM - 10 Years Ago
In India, we have a superstar who uses only FOUR expressions in all his movies and he openly says it.

He says given a dialogue, he decides which combination to use - 4+2+3+1 or 2+4+1+3.

This lipsync looks like that.


Is the quality off because it's a rush job, or because of a lack of options in 3DX4 to import all the data?
By mark - 10 Years Ago
I'm sure folks are gettin' sick of my repeated example 'cause it ain't that good. I think I could do a better job now.  I went the "talk-the-lyrics-in-time" audio file route. It worked OK, but with a bit more tweaking of the visemes it could have been even better, I'm guessing...



By prabhatM - 10 Years Ago
mark (5/4/2015)





Do you think anybody really cares about the LipSyncing in this movie?

The whole thing is so amazing !

By mark - 10 Years Ago
Hee. Hee....Well you're kind but I really was trying to get the lip-sync right :P
By TimV - 10 Years Ago
Hi Mark

Thank you for posting your music video.  I agree with the previous posting that it is amazing.  There's a great deal of creativity and artistry in it.  You have thrown a gauntlet for me so to speak since I seem to be in a bit of a medieval phase right now.  My imagination is way ahead of my abilities, though.

I made some decent progress with Audacity, but I am not happy with what I will have to do to stretch the words to match the mp3.  Blocking to the start of each line wasn't bad, though.  I'm going to see what I can do on the text timing with SYLT Editor and its "synchronized lyrics or text".  I did end up down the rabbit hole because SYLT introduces the prospect of LRC, which was supposed to make my life easier.  The LRC trail led to supporting software from Sony, which re-cataloged all the video, music and photos on my machine for most of the morning for absolutely no good reason.  It obviously wasn't an LRC editor :)

I've hit a fork in the rabbit hole, which has led me to Subtitle Workshop for the timing implications (and also because I found no way to add an SYLT packet onto my MP3).  I've read that once I get my text working, I can run a fake video in Visual Sub Sync (i.e., my MP3 audio file) to line everything up.

So, well....  I will report back, especially with anything positive.

Thanks

Tim
By TimV - 10 Years Ago
Hi:

Is there a way to easily control the settings of the timeline?  My text file is getting cut off at the end of the timeline.  I can't seem to drag it to extend it, and if I add frames, it doesn't change the fact that my text speech file has already been cut off.  How about an easy way to adjust the frames per second?  Is there a display anywhere to enter the desired number?

Arrrrg

Thanks

Tim
By Rampa - 10 Years Ago
Hi Tim,
Look in your "Project" panel, under the "Time Unit" section. There is a field where you put in the number of frames you want. I think the Maximum is 54,000, or 15 minutes.
By mark - 10 Years Ago
Now, what I did with my vocal sync was to play the song in my headphones while I spoke the lines into the mic and recorded... not worrying about pitch, just enunciating the words. I then loaded the song into my iClone stage, attaching the mp3 file to a prop, and then added my "Vocal-Track" to the avatar.
By TimV - 10 Years Ago
@Rampa

Thanks for the location information.  Just what I needed.

@Mark

I might have to break down and buy a mic.  I will look into that option.

In the meantime, I am becoming great friends with Audacity.  I used a program feature to split my audio track automatically based on "detach at silence".  That gave me individual blocks of spoken text that I could then line up with the vocals on the mp3 track.  Prior to that I went back to Balabolka and re-read my text file at a slower rate.  Some of the blocks are now an almost exact match.  With a little time and effort I can output a line at a time, or even a word for areas that aren't close enough after the first pass.  I'm thinking this is going to get me the results I want as long as my patience holds up.
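
(The scripted equivalent of that "detach at silence" split, as far as I can tell, is split_on_silence in the pydub Python library. Another rough, untested sketch - the file name and thresholds are guesses:)

    # Rough, untested sketch using pydub; the file name and thresholds are guesses.
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    speech = AudioSegment.from_file("lyrics_tts.wav")
    blocks = split_on_silence(speech,
                              min_silence_len=300,              # ms of quiet that counts as a break
                              silence_thresh=speech.dBFS - 16,  # 16 dB under the average level
                              keep_silence=100)                 # keep a little padding on each block
    for i, block in enumerate(blocks):
        block.export(f"line_{i:02d}.wav", format="wav")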

Thanks for mentioning that you attached to a prop.  That is a great idea.  I am attaching my speech to Heidi and the actual mp3 is in as background audio.  Attaching the mp3 should give me a sound line for the music which will help me sync things up.

Thanks,

Tim
By prabhatM - 10 Years Ago
sw00000p (5/6/2015)
Is this the song?



Couldn't this be done inside ICLONE without any external help ?

By TimV - 10 Years Ago
Yes, that is the song and the lip-synching is pretty much spot on.  If, as I expect, I am about to find out that this was done entirely in IClone6, I have obviously spent time learning things External to IClone6 that I would much rather have spent learning things Inside IClone6.

That being said, I would point to the large friendly "Newbie" label that indicates my experience with IClone6 and very politely ask how this was done.

If it was with a mic, I obviously should have bought one of those instead of MediaMonkey, which did accept my lyrics, but only as a whole block of text at the beginning of the track that I have been unable to "sync" with the music. I was going to work on the spoken text file with SYLTEditor today, as it allows time stamps that I imagined would help in Audacity.  I had hoped to find a tidy digital solution that I could be sure I could line up accurately, at a price that I could afford.

Please advise on how this was done.

Thanks and best regards,

Tim
By TimV - 10 Years Ago
Hi sw00000p

Thank you so much for taking the time to create this tutorial for me.  I really do appreciate it.  I can see that I have a great deal of learning to do.  It's clear from the results that you have shown, that what I wish to do requires an investment in time and patience.  As is usually the case, there clearly are no shortcuts if one wants to do it properly and convincingly.  I am a big fan of realism and so I will take my time and learn what I have to learn.

Thanks for sharing.  I now know how I want to proceed rather than trying to rush my way through.

best,

Tim
By prabhatM - 10 Years Ago
sw00000p (5/6/2015)
TimV (5/6/2015)

Please advise on how this was done.

Thanks and best regards,

Tim

Tim, this is One Way!





I missed the context.
Was this tutorial about "how to do better LipSync in ICLONE?"
Or,  doing the LipSync outside ICLONE and then importing the facial animation into ICLONE ?

By TimV - 10 Years Ago
Hi sw00000p

Again, my sincere thanks for your efforts in laying out the process and providing specific, useful information.

It would obviously seem that starting / continuing with the vertex animation would be a good choice since I know little about any animation.  I am pleased with the animation that was achieved using a Daz model, exporting the .fbx, and then loading the model into 3DXchange5 (Pro).  That provides a decent group of visemes for the facial movements.  I see that 3DXchange5 provides the ability to tweak those visemes to enhance how they appear.  Not sure about the realism aspect, but it would be a good learning environment I think.

I upgraded my version of IClone6 from Standard to Pro a couple of days ago and haven't done anything with it, but I am hoping the enhanced facial puppeteering will help on the realism side and it does support bone level expression for when I am ready to take that step.  I don't know what "detail solo feature control" is yet, but I will find out.  It also has more flexible timeline features which I expect will be useful.

At the moment, I do have a spoken word text file that has no background noise.  Can I not use the timeline enhancements to adjust my spoken word track's timing to manually sync that file with the actual mp3 of the song?  Your sample in this post is perfect from a timing perspective.  I would like to have a process that synchronizes my text file with the mp3.

I want to focus on the balance of the facial animation because that is where the reality of a live performance lives.  I went to a concert and had a good seat to watch Colin Raye.  Like most professionals, he does not sing his songs.  He lives his songs from the heart.  That's what you don't see when you buy an MP3.  That's what I want in my animation.  I think the aspect of timing the lip-sync should be like putting a canvas on a stand.  Applying the paint is where the magic lives.

You mentioned a package in a previous post and I will go back and see what it was after I post this.  The implication that I got from your note was that this software was capable of preparing the canvas (so to speak, by my analogy above).  Is that the case, and are you aware of any other packages that might be a little more affordable?  I think you noted that it was around $1000.

Can't wait to hear your thoughts on IClone6 Pro and if I can do enough tweaking on the timing there.  Then I could focus on adding bones to handle the facial animation.

best,

Tim

By prabhatM - 10 Years Ago
Tim,

With the help of 3DX and ICLONE you can create reasonably good lip-syncing.

1) Start with the basic lip-syncing.
2) Play around and practice to improve in ICLONE.
3) You may have problems getting facial expressions from other software into ICLONE, so stay focussed on ICLONE.
4) If you keep switching between Maya, Blender, 3DS Max, DAZ and ICLONE, you will never be able to complete a project.


I just read your posting. I guess you would do fine with "JUST" 3DX and ICLONE. 

Please don't set yourself on a wild goose chase when you are LEARNING ANIMATION. 





By TimV - 10 Years Ago
Thanks prabhatM

I think the first thing I want to do is investigate the features in IClone6 Pro.  

If I can control the timing on the text speech in there, to line it up with the MP3, I think that is a good place to start.  

If I can't get what I want specifically, I know I can get pretty close with lining up blocks in Audacity.  Usually, with this particular song, I can put two lines in a block and sync that block to the audio track of the song and get very close.

Since I now have a good grip on editing in Audacity, I'm comfortable.  I have also discovered the art of creative spelling in my spoken text file.  For example, matching the word "perfect" with the timing doesn't work well.  Matching "purr fect" works very well.  Some words are spread over several beats, so that will be more challenging.  "Destiny" overlays much better as "desk in he".

I'm sorry I said anything about $1000.  I went back and looked at sw00000p's post and didn't find anywhere where it said that.

best,

Tim
By Rampa - 10 Years Ago
Good call on the upgrade, Tim! Many of us just assume everyone gets "Pro" to start with, so we would end up giving advice that wouldn't make sense.

Yes, with the full timeline, you will be able to cut-up and stretch your audio just as you need. The "Facial Puppet" and "Facial Key" windows are also very powerful, and allow for animation of facial features both in groups and separately. You will also be able to change/add/delete the visemes. For those longer beats, sometimes just re-positioning a viseme a bit later in the timeline is enough.

That fancy stuff sw00000p was showing you from Max can also be done right in iClone (Pro), with the advantage of it being on an iClone character which you can further animate.

Good luck!
By TimV - 10 Years Ago
Hi Rampa:

Thanks for your note and the info on the Pro version.  Now I am excited.

Tim
By prabhatM - 10 Years Ago
sw00000p (5/7/2015)

I use iClone to "Render." I import everything!







That sounds very interesting.

How do you import a Pre-Composed Multi-actor scene with body and facial animations into ICLONE?

By TimV - 10 Years Ago
@sw00000p

Thanks for clarifying your process.  I will certainly keep heading toward the facial bones.  In the meantime, my time is available for learning whatever I need to build a foundation that I can understand.  I would love at some point to be able to do this as a paid activity, but

First you get good, then you get fast.  It's a quote from somewhere, but I'm not sure of the original source.  (I heard it in M.A.S.H.)
@all
I was writing this to find out how to make the "break" icon work on the sound file, but then I tried something I thought I had tried before: left-clicking on the file in the actual viseme line and then clicking the icon.  It worked, so I suspect I had not actually tried that before.  I think I had been trying to "break" the voice track, but it wouldn't break.

Is there a way to display the waveform in the Sound line under Project like it is displayed in the Voice line under Viseme?  That would help me see where the silent spots are, to line up the two tracks.

Thanks,

Tim


By TimV - 10 Years Ago
For the benefit of any other newbie reading along.

I have learned that I do not need to use Audacity.  Too bad, it was such a good price.  :)

 I have loaded the file into the Modify panel in .wav format, and I have also copied the text into the TTS window.  Both ways allowed me to enter the file.  I'm not sure they both put the file in the Viseme / Voice track, but it is there now and I am happy.  The actual mp3 is in the Project / Sound track, so all is well.

It looks like I will be ok adjusting the speed of the text track section now that I have figured out how to break up the file.  

I am trying to use Flags in the project track to line up where my voice starts for the lines of the song.  I'm finding it challenging to find the right spot.  Once I get that in place, I can use it as the indicator for where the spoken text track should start up for the next section (line or verse).

If I hadn't gone through what I did to get to this point, I wouldn't have the understanding of what I'm trying to do in a framework that I can now make use of in IClone6.

All of the help and suggestions have been very useful to me and I thank you for the hours of frustration that you have saved me.

best,

Tim

The edit was to fix a missed letter, but I just realized that this learning exercise has driven my promotion from NewBie to Junior Member.  That actually seems true:)
By Rampa - 10 Years Ago
I think I can help you with the alignment thing, a little anyway. As you noticed, the only wave-form display is for "voice". If you want to have both the music wave-form, and your TTS wave-forms and visemes all visible together, then assign the music track as the voice of a second avatar. Since you can have everybody's tracks open at once in "Pro", open your voice track, and the music track. You can close all the tracks and sub-tracks you don't need at the moment so that you can have both wave-form tracks together.
By TimV - 10 Years Ago
rampa

Ohhhhh, thank you.  That's brilliant!!!  I will try that.  I had actually gotten pretty good with the flags and broken up voice text file.  I still have more to do so I will try your suggestion.

Thanks again

Tim
By prabhatM - 10 Years Ago
sw00000p (5/7/2015)

Look! I don't want to argue nor debate YOU!

I take what I need from iClone and various other apps. into:

1. MotionBuilder
2. 3ds Max
3. Endorphin
4. REALFLOW
5. TopoGun
6. After Effects

Pre-Composed... this or that BS doesn't bother me. I tear it apart and make it the way I WANT IT.
_________________________________________________________

Import and render what I need to iClone .... then finish in Post... with AE and Premiere Pro.










@Tutor !!!
===============


First you say you use ICLONE ONLY FOR RENDERING and IMPORT EVERYTHING.
When we ask you to clarify your IMPORT ( INTO ICLONE )  process, you start  talking about  the EXPORT ( OUT OF ICLONE )  process !
In your High-end Process, are they the same? :D

By TimV - 10 Years Ago
Hi:

@rampa
Thanks again.  I have added another character and exported the mp3 of the song as a .wav file.  I have that set at the Viseme track for the second character so I can now see my waveforms for both tracks.

@all

I have taken another little side trip, but I really believe it is important and worth the effort.  

I did a lot of work last night on the overall timing, and with breaking the spoken track to give me the flexibility to line up the lyrics line by line, or phrase by phrase. That was the original "final" plan.

That did bring some of the text into a decent synchronization between the words and the song.  Adjusting the text track segment lengths of the broken up text .wav speech also closed a number of out of sync areas.  

It doesn't look close enough.  I still have out of sync areas that look like watching old kung fu movies - the English sound stops, but the characters keep talking.

In my case, since the language is the same on the two tracks that I want to sync, some of that mismatch is phrasing, some of it is syllable stress, and some of it is pronunciation.  There is also the creative aspect of singing, where words are stretched or altered to fit the beat of the music.  Playing with the length of a track segment that represents a line of the song may allow for end-to-end synchronization, but differences in how the line is spoken become apparent.

To that end, TTS systems allow control of the flow of the speech: enunciation, pronunciation, delays, voice speed and pitch variances, etc.  This is done through the use of XML tags embedded in the text.  I have experimented with adding these tags to my speech text prior to having the software create my .wav file of the lyrics.  I'm stumbling at the moment, but I will certainly get better at it.  The results are most encouraging.

I have been able to match the way the words are sung in the song by using tags to change syllable stresses and by specifying pronunciations.  The plan is to get the spoken file as close as I can, create a new .wav file and bring that into IClone6.  From there, when I break the spoken file into segments, I believe I will be able to get very close to an exact match.
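
(For anyone else heading down this road: the kind of markup I mean is, as far as I understand it, SAPI 5 style XML, which the TTS voice interprets when asked to. A rough, untested sketch assuming Windows and the pywin32 package - the words and tag values are just made up to show the idea:)

    # Untested sketch: feed SAPI-style XML tags to a Windows SAPI voice and save a .wav.
    # Assumes Windows with the pywin32 package; the words and tag values are made up.
    import win32com.client

    voice = win32com.client.Dispatch("SAPI.SpVoice")
    stream = win32com.client.Dispatch("SAPI.SpFileStream")
    stream.Open("lyric_line.wav", 3)          # 3 = SSFMCreateForWrite
    voice.AudioOutputStream = stream

    tagged = ('<rate absspeed="-3">Purr <silence msec="120"/> fect '
              '<emph>in</emph> <pitch middle="4">every</pitch> way</rate>')
    voice.Speak(tagged, 8)                    # 8 = SVSFIsXML, so the tags are interpreted
    stream.Close()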

I expect it will take some time, but I will post the result in a small sample which I hope will demonstrate the effectiveness of the process.

best,

Tim
By prabhatM - 10 Years Ago
sw00000p (5/8/2015)


Ummm! Clarify Import, OK!  I use 3dxchange.
"Plain Simple English!" :pinch:


===========================
You famously said you  IMPORT everything and use ICLONE only for the rendering.
===========================

What all do you IMPORT through 3dX for RENDERING in ICLONE ?

Do we presume that you set your SCENE in another software and IMPORT it to ICLONE ONLY FOR the RENDERING ?





By TimV - 10 Years Ago
@sw00000p and @prabhatM

I'm not sure if there are any moderators.  I'm not sure how long you two have been feuding.  I'd like to ignore it and pretend that it isn't happening.

What I do know is that the conversation that you two are having in MY thread doesn't have anything to do with MY issue and is not helping ME or ANYONE else make any progress.  

If the two of you have any further suggestions that would help ME, by all means, please share.

If you really want nothing more than to slug it out with each other, might I kindly suggest that an exchange of email addresses would provide you with a direct link to each other that wouldn't be interrupted by my posts about what I am trying to do and how I am approaching it.

It has been my hope that someone would pick up on something I was saying and think of something to share with me that might be helpful.  Rampa's post is a perfect example where he suggested adding a second character in order to be able to assign the full version of the song to a voice channel where I could then see the wave form.

If there are any suggestions on embedding SAPI5 tags in my text file, or other ways of manipulating that file that can affect the timing of the words, THAT would be very much appreciated.  

Thanks,

Tim

Edited to add the I for If
By prabhatM - 10 Years Ago
sw00000p (5/8/2015)
1. I work on animated characters.
2. I work on Props... some animated.
3. I use 3dxchange to import.

What part of this you don't understand?:w00t:


Then you do a whole lot more in ICLONE than just rendering. You set up these animated characters and props. You set up the scene and the interactions between the elements. It's not just bringing in a WHOLE PRECOMPOSED SCENE TO ICLONE AND JUST RENDERING IT.
This is not just clicking the Rendering Button in ICLONE and ..................... Poof ........... you are done with ICLONE !!!

And you are not alone. Most people do that in ICLONE. They import many individual elements to ICLONE and work.

When you say "I just use ICLONE for Rendering and IMPORT everything", you almost sound as if ICLONE is GOOD FOR NOTHING AS FAR AS YOU ARE CONCERNED and the only attractive feature it has is RENDERING. You use ICLONE for just a few seconds.

You almost made the World believe that you have a SECRET WEAPON / ALEMBIC IMPORTER FOR ICLONE through which you can IMPORT the whole scene with a single click and then just click the ICLONE render button....Poof !!!

You offer out-of-context suggestions and blatantly wrong impressions to the newbies who look for sincere guidance. 

You take them on an unnecessary wild goose chase. Do that with your clients. They might get the impression that you are an EXTRAORDINARY TALENT, and their only hope, and may dole out a few extra bucks. Don't do that with the newbies here on the forum.

Resist that. DO NOT SAY WHAT YOU DO NOT INTEND TO SAY. YOU ARE A TUTOR. YOU MUST KNOW WHAT TO SAY AS A PART OF YOUR REPLIES.

@TimV

You see a few, perhaps 10-15 people being active on the forum. Does it mean ICLONE has only these 15 users all over the world ? Does it mean that only these 15 people have problems with ICLONE ?

No. 98% of people do not ask questions. They search the forum for the answers to their problems, read diligently, and learn.

So your question actually does not remain yours once you put it on the forum. You represent a whole lot of people who have similar issues. When the problem is resolved, it is resolved not only for you but also for a whole lot of people out there.
In any reply, every sentence is important, particularly the conclusion part. 

People may offer TENTATIVE SOLUTIONS. That's OK. They are just offering suggestions and may not be perfect. They are participating in a debate with their suggestions. They can make mistakes. No issues.
But.....

When somebody PRETENDS and POSES as an authoritative Industry Expert and offers a DEFINITE ANSWER, he must not BS and must be careful what he says.

Hope you got the context of this debate. There is nothing personal about it.

By planetstardragon - 10 Years Ago
Good to see a 'same old' feud that didn't involve me for a change. Good call, Prabbhat.



By TimV - 10 Years Ago
@prabhatM

Thank you for taking the time to explain the reasons for your responses and the background behind your actions.  I did actually get the point that you were clarifying that it is not as easy as "drop it in and poof, the miracle happens".

I have spent countless hours in forums for Daz, Hexagon, Blender, Vue, Carrara, Gimp, and 3d_Coat.  Sometimes I just read, without a specific topic in mind, purely for education.  Often I get good information and then it is a matter of figuring out how a different program does it.  What I have learned is that while some programs may be better at some tasks, there is no one program that does it all.

In my opinion, IClone6 Pro is the best product I have come across.  I am truly excited to work with it.  I have actually read the entire online help file.  There is still a gap, and it is on my side of the fence.  The folks who write these files know how to do what they are explaining.  Until I get a frame of reference, I may not understand something that is clear to the author or others.

I pick something as a project that I want to do and then stick with it, one step at a time, until I get as close as I can.  I will almost always find myself on side trips down rabbit holes, because I believe that others are trying or will try to do the same thing and there might be something out there that could save me a few hours of frustration.  For example, Audacity would let me strip the vocal out of my mp3, but it means that I need the instrumentals from the song in order to cancel out the instrumentals in the mp3.  Makes sense.  The reality is that I am not in a position to get the instrumentals, so that solution is not an option.

I really hoped that I would find some tools that would allow me to write a step by step account of how to extract the vocals out of the mp3 and use them to drive the Visemes with almost perfect timing.  I think that would be a great contribution to the forum.  It's something that I would really have liked to discover.

When I am finished (or as close as I can get), I will provide a summary, including which software has been helpful in getting me closer and what that software allowed me to do.  I believe at this point that I can get close enough that I will be left with adjusting the lips track of the Viseme manually to finish the timing.  

I am of the opinion that the best suggestion was given to me on the first page of this post.  I can't quote it exactly, and I will credit the post in my summary since I didn't note the name before I started this post and can't get back there while I am in it.  Get a mic, practice the song and record your own vocal track.  It can be done entirely within IClone6.

I have learned a great deal by researching possible options and actually working with the results in the software.  My time is the price I pay for education and it is always a good investment.

I thank you for standing up on my behalf although I regret being the cause.

best,

Tim
By prabhatM - 10 Years Ago
TimV (5/9/2015)


I have spent many years in elearning and advertising. I believe in Orthogonal Projection of information when it comes to learning - Straight without any Noise so that the learner gets it immediately. We have to stay focussed on the contextual use of the information. 
In marketing, one could use all kinds of tactics to play up some peripheral information to project one's image. But while helping / tutoring somebody, one must stick to the point and maintain Brevity.

That was my point.

Besides, my mind is trained to detect  deviations. Many years of serious training wouldn't go away easily. Even a very casual side glance will catch a deviation. That's why wrong, out-of-context or meaningless padded info catches my attention immediately. That's the only way I could manage a large production floor. And the habit has stayed with me.

I do not react to Sw00000p's padded info when he deals with the senior guys. But I find it difficult to ignore the same when a newbie is involved.

My job is to point out the issues to him. He will take a while to change. And that's fine.

He is helpful; it's just that he needs to change his style when he is in a peer group. There are no personal issues between us.

By TimV - 10 Years Ago
Thanks

I need to work on the brevity part and let people ask if they didn't get the meaning of something I said.  That's not a response to anything you said.  It is merely an observation on my part.  

I picked up a mic today.  In fact it's a headset.  That way I can get my voice track in while listening to the song and not have the song picked up by the mic.  It means I have to rewire my PC.  I have a good PC-based surround sound speaker setup, but it's hooked up to a different PC which is set up with a touch screen just for music in another room.  My "work" PC is hooked up to my receiver, and I can't reach the back of the PC for the headset.  

Always something  :)
best,

Tim
By planetstardragon - 10 Years Ago
For the audio part just listen to your audio in headphones while you record on the mic.

Extracting vocals from mixed audio can be like unmixing coffee and milk.  The general process is to filter out all the frequencies around the voice and use phase cancellation.  The problem is that the voice shares many of the same frequencies as the music itself, so it will never be a perfect extraction.  You can do the filtering with specialized EQs, but finding a phase shifter that allows variable settings is more difficult; some of the units capable of doing this efficiently are expensive and usually come in the form of hardware, e.g. the Edison.  You can come close, but not without glitches, which usually require another step of rebuilding to fix.  It's a technique producers and DJs use when they want an a cappella to remix; the glitches aren't a big deal there because they are often covered in the mix with new music and effects, but as a bare a cappella it's less forgiving.  It's more work than it's worth, and you will never have a perfectly clean vocal.
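
(the bare-bones version of the phase cancellation idea, for anyone curious, is just subtracting one stereo channel from the other - that knocks out whatever is panned dead center, usually the vocal, instead of isolating it. an untested Python sketch assuming the soundfile package, with a made-up file name:)

    # Untested sketch of the crude phase-cancel trick: subtracting one stereo channel
    # from the other removes whatever sits dead center (usually the vocal). It gives a
    # rough karaoke track, not a clean vocal. Assumes the soundfile package.
    import soundfile as sf

    audio, rate = sf.read("song.wav")               # expects a stereo file, shape (samples, 2)
    karaoke = 0.5 * (audio[:, 0] - audio[:, 1])     # centered content cancels; halved to avoid clipping
    sf.write("karaoke_guess.wav", karaoke, rate)
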
By TimV - 10 Years Ago
@planetstardragon

Thanks for the input on unmixing milk and coffee.  I like that analogy, and that is how I felt after all my searches.  I found some tutorials using Audacity with a serious presentation of copying the track and flipping it into a mirror image of itself to remove the music.  Two things struck me right away.  I would expect that the inability to mask out any frequencies beforehand would mean that the final result should cancel the original and produce silence.  That at least is what my limited exposure to physics would suggest.  The second thing that struck me was that in their serious proposal, they had completely missed the irony of what they were suggesting based on what the software is called.  

Thanks

Tim
By planetstardragon - 10 Years Ago
@Tim, yeah, that's why the better phasing software is more expensive; it is able to niche out frequencies with heavier processing, whereas the Audacity approach changes the phase of the whole audio spectrum, rendering the whole process moot.  To do a decent job it's a process of first removing all you can, then rebuilding what got damaged in the process... so it never really sounds like an original a cappella no matter how well you do it.  It's just the nature of the beast.
By prabhatM - 10 Years Ago
For vocal separation, try ADX Trax.

You may not have a better tool than this. 

By planetstardragon - 10 Years Ago
swoop has amazing knowledge,  he just has an appetite for food fights lol

i created a new icon,  less fighting for me these days -hands swoop a daisy-
By prabhatM - 10 Years Ago
sw00000p (5/10/2015)
TimV,

ADX Trak  $300 bucks.  Good Tools!
Can the person whom offered this $300 dollar purchase... TEACH you how to use it?
Let's hope so.:)







For your kind information, I set up my first audio studio in 1975 while studying in school. I used an 8-track Nagra spool recorder for professional recording. I designed a cassette duplicating machine (1 to 250) by connecting cheap decks, as I could not afford a high-speed duplicating machine. I hired guys playing in wedding bands, graded them (A/B/C category) and fixed the studio rates. That system is still running in my old home town.

After decades, I showed interest in music just a few months back. Now I do full instrumentation on my computer. I create beats by playing on the table / bucket and WHISTLE the tune and then SAMPLE them with real instruments ( percussion, trumpets and strings etc.)

Here are few tracks. Many more to come in a few weeks :

https://soundcloud.com/search?q=prabhat%20mohanty

If you have a good ear for music, listen carefully for the complex layering.


I am not a Programmer, yet I do very complex programming for my projects that can compete for VC funds globally.
I am not a writer, yet I do all my screenplays,
I am not a poet, yet I wrote all songs for my projects.
I am not a musician, yet I create all the tracks for my projects.
...and there are many facets I am yet to discover.
By prabhatM - 10 Years Ago
sw00000p (5/10/2015)

For your kind information, I have no interest in this.
1. You offered the OP a $300 audio editing software.... Can you teach the product YOU offered?










You mean to say you are going to teach him how to click a button in this software ?

Only the  Dumbheads and fools brag like that.

Stop losing your self-esteem, if you have any left.

By planetstardragon - 10 Years Ago
so,  i found the perfect icon for you guys

swoop -  you wear this one
http://www.bartvermijlen.com/wp-content/uploads/2014/12/the_ugly_baby.jpg


and prabbhat you wear this one
http://images.boomsbeat.com/data/images/full/500/tumblr-jpg.jpg


By TimV - 10 Years Ago
My Thanks to all!!!

Maybe I am missing something, but from what I saw, I left with the impression that ADX Trax was Mac based, which rules it out for me.  I also noticed that it was subscription based, and cloud based.

Now, yesterday I invested a whopping $12.00 for a standard headset.  Ok, I need to spend another $5.00 to get a mini plug extension for the headphone piece.  That jack is on my speaker system and not the back of the PC where the mic is.

That's less than the $19.00 I would have to pay for 1 month of ADX Trax, and I'm not sure I can buy one single month.  I am working on a one time project, which may or may not lead me into doing more of this type of thing.  For certain, I can see me creating short films or maybe a TV type episode based series, but I don't see that involving a lot of singing.

I did buy TextAloud 3 audio software because the ability to handle pronunciation was better there than I saw anywhere else.

I can't see investing $300 until I know I am going to be using it a lot.  If it is Mac based only, I would have to add the price of a new system and the math is just not working for me.

I will get my mini plug extension cable today and see what I can do with a mic and the original mp3.  After all, my goal is to lip-sync a character.  I am more interested in the animation aspects right now than I am in the music.

best,

Tim

By prabhatM - 10 Years Ago
There are many music-oriented software packages which are Mac based. 
Check out the market. Somebody nearby might have this; you could pay for a one-time use.

If you are serious about producing a series, my suggestion is you go for real audio rather than TTS. If you contact a recording Studio, they can produce the audio for your series for a reasonable price. They can arrange the voice talents too.
By prabhatM - 10 Years Ago
Sometimes I wonder why one has to use TTS for animation work at all! Even if it's only a hobby!

To convert your PC into a good DAW, all you need is a USB-based OMNI DIRECTIONAL DYNAMIC MIC (avoid a condenser mic). This should be around $120 or so. Then you need just AUDACITY, a FREE programme. It has a good background noise cancellation feature. 
Then invite your family and friends to lend you their voice. Even if they are not professional, they will start picking up the thread after 2-3 takes.

Their voices would be so much better than the awful TTS voices people tend to struggle with!

It's much easier than you think. Start with a $2 mic ! But start playing with the real voice. Dump TTS.
By TimV - 10 Years Ago
Hi:

Thanks to all for the input regarding the voice source.  The clear message is to avoid TTS in favour of actual voices from real people.  I would think that puts me more in the role of director.  Would that mean I should be attending the "readings", or do I need to provide more of a screenplay / storyboard to describe what I am looking for ahead of time?

My original thought was that I would use the TTS voices and then put them through a Voice Changer Software package to create different "characters".  The characters would be saved and then I would call them up to feed them their lines as needed.   Any thoughts on voice changer software or the feasibility of this as a process?
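
(The crude version of what I have in mind is the resampling trick I have seen shown for pydub: pretend the audio was recorded at a different rate and the pitch shifts, though the speed shifts along with it, so it is more "chipmunk/giant" than a proper voice changer. An untested sketch with made-up numbers:)

    # Untested sketch: the resampling trick shifts pitch (and speed along with it).
    # The factor and file names are made up for illustration.
    from pydub import AudioSegment

    voice = AudioSegment.from_file("tts_line.wav")
    factor = 0.85                                   # below 1.0 = deeper/slower, above 1.0 = higher/faster
    shifted = voice._spawn(voice.raw_data,
                           overrides={"frame_rate": int(voice.frame_rate * factor)})
    shifted = shifted.set_frame_rate(voice.frame_rate)
    shifted.export("tts_line_deeper.wav", format="wav")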

Thanks

Tim  
By prabhatM - 10 Years Ago
TimV (5/11/2015)



Voice-changing software works pretty well for robots, aliens, etc...

But you would waste a lot of time doing TTS plus voice-changing software for projects dealing with real people. The outcome is not worth it.

Go for the proven method. Do it the way professionals do. Take your first baby step - set up your own audio studio in your study room. Cover the windows with blankets to muffle the noise. Record real voices with a DYNAMIC MIC. You will be better off.
If you want to complete a movie in  real sense, please take the proven path.

By mark - 10 Years Ago
Now I didn't do a real good job with this, but you might be surprised by what you can do just changing your own voice in subtle ways to achieve different characters. I did all the voices for this short... as I said, with varying degrees of success... :blink:



By prabhatM - 10 Years Ago
mark (5/11/2015)





Yes, voice-changing software works nicely with a real voice - but only if you are good with modulation. I too started thinking along those lines. But when you need 8-10 voice talents, you might as well hire all of them and do it in one go. It depends on the project size.

If you tempt guys to use artificial voices for their movies, then you have to help them with the visuals also. With all your arresting visuals, people never notice other things.

We don't have your talent. 
By prabhatM - 10 Years Ago
sw00000p (5/11/2015)
: (Omni Directional Mic) and Audacity!








Please use an OMNI DIRECTIONAL DYNAMIC MIC, not a condenser mic.

Condenser mics are better, but they need a professional sound booth; otherwise they will pick up noise easily.
If you are setting up a home audio studio, go for a USB-based OMNI DIRECTIONAL DYNAMIC MIC and patch it into Audacity.

Even better, for voice-only work, go for the headsets people use at call centres. They have built-in noise cancellation features. 
Once recorded, play around a bit with the gains, bass etc. in Audacity.... and as you say..... Poof !!!


In fact we use all kinds of mics... cheap $1 mics to $1000 mics, depending on the characterisation. Never throw away even a cheap or slightly defective mic. You might be surprised how it can give the right voice quality for a particular character!

By TimV - 10 Years Ago
@Mark

Wow.  That is an amazing video!!!   I think I can handle creating a few different voices.  I know what mic to look for.  I suspect I will get a gaming headset since it looks like the desired features are built in. 

I'm caught on something that seems to me like it should be really easy, but I can't make it cooperate and there isn't anything in the help file that helps.   (sorry, couldn't resist)

I have my voice recorded track in the Viseme of my primary character.  I have a secondary character and I want to copy the track so that they both have the same one to drive the lip-sync. They are the same character at different points in time which I will do by creating a camera for each of them and then select the cameras accordingly in the switcher track.  

Copy and Paste work fine, but it only pastes in the same track it copies from.  Do I have to create a second track with the mic like I did the first one?

Thanks

Tim
By mark - 10 Years Ago
Not sure if this is what you're asking but of course... you can load the same audio file into two different avatars. Then cut, paste & delete the appropriate segments.
By TimV - 10 Years Ago
Hi Mark

Thanks.  That's not really what I wanted to do.  I wanted to copy the audio track from Avatar1 to Avatar2 so that they both had the same track.  I had recorded my track within IClone6 to add it to Avatar1, so I had no file to add to Avatar2.

Since I posted, I re-did my vocal lip-sync track in Audacity in order to get an actual .wav file.  I then loaded that file into both of the avatars and synced them up.  It worked, and now I have a timing file saved external to IClone6.

I still think there should be a way to copy the Viseme track from avatar1 to avatar2 within IClone6.  Am I being unreasonable or missing out on how to do it?

Thanks,

Tim
By Rampa - 10 Years Ago
TimV (5/11/2015)


There is!
It's slightly different, in that there is no collect clip involved, but try this.

Select your speaking avatar.

In the content manager (template or custom), select the animation tab and the "Facial" folder.
Hit the "+" at the bottom of the content manager and it will save the entire track's facial performance and audio to your custom tab.
By Rampa - 10 Years Ago
@sw00000p
It'll work the same as iC5 and iC4, AFAIK. I'm assuming you're just using an animated prop rather than a characterized iClone avatar.

If you would share the model as you've created it, we could probably figure out a whole process faster. The difference will be in setting it up in 3DX, not iClone.

I also think he has it just fine. Remember, he's figuring out how to do it so it makes sense for him, and he does not have Max. In fact, he would rather do it in iClone, and already has.
By animagic - 10 Years Ago
rampa (5/11/2015)

You can also save everything as MotionPlus clip. That will include the facial performance and everything else.
By TimV - 10 Years Ago
Thanks to all for the support and the suggestions.

As stated at the onset, I only ask to be pointed in the right direction.

@rampa

Of course it worked!!!  +2 for my understanding file.  I have two talented singers with functioning lip-syncing that I can live with.  I will have to do some tweaking, but that was always expected.  It's another learning opportunity so I am looking forward to it.

@all

Right now, looking slightly ahead, I think the character movements are going to be a real challenge.  I think I will try poses along the timeline.  The Viseme .wav file is very pronounced and easy to follow for the lyrics, so I will set my poses according to that timing and then look at the facial and other movements to add more detail or transitions.

Further ahead, I'm thinking of two sets in separate areas of the grid, each with their own camera(s).  One set for the past, and one for the present.

Of course, suggestions and ideas are always welcome, especially if I appear to be headed for resource issues or technical aspects that I just can't see yet.

Thanks and best regards,

Tim