
Speech Synthesis

Started by September 07, 2000 08:30 AM
7 comments, last by Yanroy 24 years, 3 months ago
I want to make a speech synthesis program, the kind that reads text to you. I think I have an idea of how to do this; I just need a couple of bits of information (maybe megabits).

First question: where can I get a list of every sound in the English language and the rules for when to use each one (things like "I before E except after C, unless sounding like AY as in neighbor and weigh")? You know, all those rules the English teacher makes you learn in elementary school. I need the sounds so I know what to record as the WAVs.

Second question: is the PlaySound() function (I think that's the name) fast enough that, when you play a sound in blocking mode (the opposite of async?) and immediately play another one, the user won't hear a pause while it loads the next sound from a file? I want the computer to say "Hello", not "H-e-l-l-o".

--------------------

You are not a real programmer until you end all your sentences with semicolons;

Yanroy@usa.com

Visit the ROAD Programming Website for more programming help.

--------------------

You are not a real programmer until you end all your sentences with semicolons; (c) 2000 ROAD Programming
You are unique. Just like everybody else.
"Mechanical engineers design weapons; civil engineers design targets."
"Sensitivity is adjustable, so you can set it to detect elephants and other small creatures." -- Product Description for a vibration sensor

Yanroy@usa.com

You want phonemes:

PhonemeList = ( 'U', 'A', ' ', 'B', 'D', 'G',
'J', 'P', 'T', 'K', 'W', 'Y',
'R', 'L', 'M', 'N', 'S', 'V',
'F', 'H', 'Z', 'AW', 'AH', 'UH',
'AE', 'OH', 'EH', 'OO', 'IH', 'EE',
'WH', 'SH', 'TZ', 'TH', 'ZH' );

Instead of rules, programs like this usually use dictionaries of how the phonemes are strung together. Look here for one:

ftp://ftp.cs.cmu.edu:project/fgdata/dict
(Latest seems to be cmudict.0.3.Z)
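
A minimal sketch of the dictionary approach, assuming each dictionary line holds a word followed by its phonemes separated by whitespace, and that comment lines start with ';'. The sample entries below are hypothetical, written with symbols from the phoneme list above; the real dictionary's symbols and layout may differ.

```python
def parse_dict(lines):
    """Build a word -> phoneme-list mapping from dictionary lines."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with ';').
        if not line or line.startswith(";"):
            continue
        word, *phonemes = line.split()
        # Keep only the first pronunciation seen for each word.
        entries.setdefault(word.upper(), phonemes)
    return entries


def to_phonemes(text, entries):
    """Look up each word; unknown words map to None so the caller can decide."""
    return [(w, entries.get(w.upper())) for w in text.split()]


# Hypothetical sample entries, for illustration only.
SAMPLE = [
    ";; comment line",
    "HELLO  H EH L OH",
    "WORLD  W UH R L D",
]

entries = parse_dict(SAMPLE)
result = to_phonemes("hello world", entries)
```

Words missing from the dictionary come back as None, which is where a fallback set of spelling rules would have to take over.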

And don't use WAVs - you can generate the phonemes on the fly ... even through the PC speaker
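
To give a rough idea of what "on the fly" means: the samples are computed in code instead of being loaded from a recording. The sketch below only produces a plain square wave, the kind of signal a PC speaker can make. The sample rate, pitch, and amplitude are assumptions picked for illustration, and shaping such a signal into recognisable phonemes is the hard part and is not shown.

```python
import struct

SAMPLE_RATE = 11025  # assumed output rate in Hz


def square_wave(freq_hz, ms, amplitude=8000):
    """Return 16-bit mono PCM for a square wave of the given pitch and length."""
    n = SAMPLE_RATE * ms // 1000
    half_period = SAMPLE_RATE / (2 * freq_hz)  # samples per half cycle
    samples = (
        amplitude if int(i / half_period) % 2 == 0 else -amplitude
        for i in range(n)
    )
    # Pack as little-endian signed 16-bit samples.
    return struct.pack("<%dh" % n, *samples)


tone = square_wave(220, 100)  # 220 Hz for 100 ms: hypothetical values
```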

Edited by - morfe on September 7, 2000 9:46:18 AM
"NPCs will be inherited from the basic Entity class. They will be fully independent, and carry out their own lives oblivious to the world around them ... that is, until you set them on fire ..." -- Merrick
"I want to make a speech synthesis program. Of the kind that reads text to you."

Hehe, that reminds me of the good ol' DOS Sound Blaster 2.0 talker... How funny it was to let it read German texts
(it was the English version of the app, you understand).

Greets, XBTC!
I don't think PlaySound() will always break up the sounds, but it seems likely that it'll happen a bit. Think about it: it loads a sound, plays it, releases it, then loads, plays, and releases the next one.
To make the h-e-l-l-o thing a non-issue, you could compose it all into some kind of buffer before you play it. That'd work great for short phrases or words, probably not so great for paragraphs.
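
A sketch of the compose-first idea, assuming every phoneme clip is raw 16-bit mono PCM at the same sample rate. The clips here are placeholder silence and the names are made up for illustration; real clips would be the recorded phoneme sounds.

```python
import io
import struct
import wave

SAMPLE_RATE = 11025  # assumed; every clip must share the same format


def silence(ms):
    """Return `ms` milliseconds of 16-bit mono silence (placeholder clip)."""
    return struct.pack("<h", 0) * (SAMPLE_RATE * ms // 1000)


def compose(phonemes, clips):
    """Join the raw PCM of each phoneme into one in-memory WAV file."""
    pcm = b"".join(clips[p] for p in phonemes)
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return out.getvalue()


# Placeholder clips keyed by phoneme symbol.
clips = {"H": silence(40), "EH": silence(80), "L": silence(60), "OH": silence(120)}
wav_bytes = compose(["H", "EH", "L", "OH"], clips)
```

The result is a complete WAV image in memory, so the whole word can be handed over in a single play call (PlaySound accepts an in-memory image via its SND_MEMORY flag) rather than one call per sound.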

-------------------
LPUNKNOWN pUnkOuter
------------------------
IUnknown *pUnkOuter
"Try the best you can, try the best you can, the best you can is good enough" --Radiohead
You want to build a data stream and send it to the sound card continuously. It's a great exercise anyway (I finally wrote a library class that does it for me). Look at using the waveOut family of Win32 functions (waveOutOpen, waveOutPrepareHeader, waveOutWrite, waveOutClose). Fun stuff.
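
The streaming idea can be sketched without the Win32 calls themselves: cut the PCM data into equal-sized blocks, which a real program would then queue one after another (for example with waveOutWrite) while it prepares the next ones. The block size is an arbitrary assumption, and the zero padding assumes signed 16-bit samples, where zero bytes mean silence.

```python
def iter_blocks(pcm, block_bytes=4096):
    """Yield fixed-size blocks of PCM, padding the last one with silence."""
    for start in range(0, len(pcm), block_bytes):
        block = pcm[start:start + block_bytes]
        # Pad a short final block with zero bytes so every block is full.
        yield block.ljust(block_bytes, b"\x00")


blocks = list(iter_blocks(b"\x01" * 10, block_bytes=4))
```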
quote: Original post by morfe

Look here for one :

ftp://ftp.cs.cmu.edu:project/fgdata/dict
(Latest seems to be cmudict.0.3.Z)


There doesn't seem to be anything at that address. I tried just ftp://ftp.cs.cmu.edu, but the project folder was empty. Actually, all the folders were empty.
quote:
And don't use wavs - you can generate the phonemes on the fly ... even through the PC speaker


How would I go about doing this?


I know it's a really evil thing to do, but I really need an answer, at least about where I can get that phoneme list. So I am just posting this to keep the thread alive.


OK, you evil person you, look here first:

http://www.cgocable.net/~jrussel/voicelab.htm

-------------------------------------------------
Mindphuq Software : "Who do you want to do today?"
And while you're at it, you might as well have a look here, mainly for its implementation of ASCII phonemes with RSynth (which is free to download):

http://wwwtios.cs.utwente.nl/say/

Edited by - morfe on September 9, 2000 11:27:13 AM

This topic is closed to new replies.
