
Analyzing an MP3 file for lip sync?

Started by May 09, 2004 10:11 AM
3 comments, last by _Danneman_ 20 years, 8 months ago
Hi guys,

I have a loose idea in my head, and before I go ahead with all the research, learning, and construction I would like to check with some experienced sound gurus (that's you, if you didn't get that) whether it's even possible.

Step 1) I want to check an MP3 file (a voice talking) for a predefined set (let's say five) of phonetic sounds. It shouldn't matter whose voice it is, or at what volume or speed it is spoken. Is it possible to distinguish these phonetic sounds with, for instance, Cubase?

Step 2) I want to export every distinguished phonetic sound into a simple text file. Every phonetic sound should be represented by a number (1-5), and the MP3 file should be checked four times every second for the current phonetic sound. So a five-second MP3 file would give me a text file containing a string of (4 times/second * 5 seconds =) 20 numbers ranging from 1 to 5.

Is this possible? Or am I wasting your time and mine?

Thanks
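To make the Step 2 output format concrete, here is a minimal Python sketch of the fixed-rate export only. It does not recognize phonetic sounds: `classify` is a hypothetical callback standing in for whatever detector ends up doing Step 1, and `dummy_classifier` is a placeholder that just cycles through the codes so the format can be seen.

```python
from typing import Callable


def sample_phoneme_codes(duration_s: float,
                         classify: Callable[[float, float], int],
                         rate_hz: int = 4) -> str:
    """Return one digit (1-5) per sampling window, joined into one string.

    `classify(start_s, end_s)` is a hypothetical detector that returns the
    code of the phonetic sound heard in that window.
    """
    n_windows = int(duration_s * rate_hz)  # a trailing partial window is dropped
    window_s = 1.0 / rate_hz
    codes = []
    for i in range(n_windows):
        start = i * window_s
        code = classify(start, start + window_s)
        if not 1 <= code <= 5:
            raise ValueError(f"code {code} is outside the range 1-5")
        codes.append(str(code))
    return "".join(codes)


def dummy_classifier(start_s: float, end_s: float) -> int:
    """Placeholder only: cycles 1..5 by window index, no audio involved."""
    index = round(start_s / (end_s - start_s))
    return index % 5 + 1


if __name__ == "__main__":
    # A five-second clip sampled four times per second gives 20 digits.
    print(sample_phoneme_codes(5.0, dummy_classifier))
```

Writing the returned string to a text file is then a single `open(...).write(...)` call; the hard part remains the detector behind `classify`.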
------------------------Why of course the people don’t want war. ... That is understood. But after all it is the leaders of the country who determine the policy, and it is always a simple matter to drag the people along, whether it is a democracy, or a fascist dictatorship, or a parliament, or a communist dictatorship ...Voice or no voice, the people can always be brought to the bidding of the leaders. That is easy. All you have to do is to tell them they are being attacked, and denounce the pacifists for lack of patriotism and exposing the country to danger. It works the same in any country. [Herman Goering]
Have hope... I know there's a program out there that does this, but I have no clue what its name is. It was free too, or else shareware or a trial version.

Keep searching.
"I never let schooling interfere with my education" - Mark Twain
1) Yes, I do believe it's possible, but I don't know how.

2) That isn't the best file format. It would probably work better if each lip sync entry could last for any length of time (measured in samples or milliseconds), rather than a fixed slot (a quarter of a second is fairly slow compared with how fast people talk).

Perhaps a list of "length of time" "phonetic" pairs?
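One way to read the "length of time / phonetic" idea is as run-length encoding. The sketch below is only an illustration of that reading: it collapses a fixed-rate digit string (the format from the original question) into (duration in milliseconds, code) pairs. The function name and the choice of milliseconds are assumptions, not something specified in this thread.

```python
from itertools import groupby
from typing import List, Tuple


def to_duration_pairs(codes: str, rate_hz: int = 4) -> List[Tuple[int, int]]:
    """Collapse runs of identical digits into (duration_ms, code) pairs."""
    window_ms = 1000 // rate_hz  # 250 ms per digit at 4 samples per second
    pairs = []
    for code, run in groupby(codes):
        run_length = len(list(run))
        pairs.append((run_length * window_ms, int(code)))
    return pairs


if __name__ == "__main__":
    # Three windows of sound 1, two of sound 2, three of sound 3.
    print(to_duration_pairs("11122333"))
```

A detector with finer time resolution could emit such pairs directly, with arbitrary durations, instead of rounding everything to quarter-second slots first.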
Andrew Russell:
Yes, that's probably a better solution in the long run, if and when I take the rudimentary system I have in mind to a higher level.

Rocket05:
Oh, a program like that would be great. I'm going blind from googling for something like that and trying different downloads from download.com, so if you remember the name of it, please let me know.
I remember hearing of a program like that also, but once again, I'm not too sure what it was called.

Some thoughts on it, though: the file you use would (I'm guessing here) probably have to be speech only, since a computer isn't quite smart enough to tell a distorted guitar from a person's voice. Everything is just a big binary waveform that it analyzes within preset parameters.
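As a rough illustration of "analyzing a waveform within preset parameters", the sketch below measures loudness (RMS) in quarter-second windows and compares it with a fixed threshold. It is far simpler than phonetic detection and cannot tell speech from a guitar. It also assumes the MP3 has already been decoded into a list of samples in the range -1.0 to 1.0; the function names and the 0.1 threshold are made up for the example.

```python
import math
from typing import List, Sequence


def window_rms(samples: Sequence[float], sample_rate: int,
               rate_hz: int = 4) -> List[float]:
    """Return the RMS level of each full window of 1/rate_hz seconds."""
    size = sample_rate // rate_hz
    levels = []
    for start in range(0, len(samples) - size + 1, size):
        chunk = samples[start:start + size]
        levels.append(math.sqrt(sum(x * x for x in chunk) / size))
    return levels


def sound_present(samples: Sequence[float], sample_rate: int,
                  threshold: float = 0.1, rate_hz: int = 4) -> List[bool]:
    """Flag each window whose RMS level exceeds the preset threshold."""
    return [level > threshold
            for level in window_rms(samples, sample_rate, rate_hz)]


if __name__ == "__main__":
    rate = 8000
    # Half a second of silence followed by half a second of a 220 Hz tone.
    silence = [0.0] * (rate // 2)
    tone = [0.5 * math.sin(2 * math.pi * 220 * n / rate)
            for n in range(rate // 2)]
    print(sound_present(silence + tone, rate))
```

Even this crude flag is enough to open and close a mouth in time with the audio, which is why a speech-only recording matters: any sound in the file would trip the threshold.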

This topic is closed to new replies.
