I don't have CTA (I'm an iClone user), but this may be a way to accomplish what you want.
It is helpful to see the waveform while you're animating, but you can't see one for music or sound tracks. What you might be able to do (you can in iClone) is add a dummy character and give him the music track as a speech track. That will let you see the waveform. As you have separate files for the music parts, you can use several dummy characters and animate the different parts in isolation. I don't know if CTA 3 has sound scrubbing for the speech track, but that would also help. It's probably best to do one part at a time: save it, remove the speech track for that part, and move on to the next one.
This is a somewhat unusual approach, but years of iClone taught me to find workarounds...