Tag: Speech Synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition.

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely “synthetic” voice output.
—Wikipedia, “Speech synthesis”
(See this article for more on history, synthesizer techniques, challenges, dedicated hardware, hardware and software systems, text-to-speech systems, speech synthesis markup languages, and applications.)
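
As a rough illustration of the concatenative approach described in the quoted passage, the Python sketch below looks up stored units and joins them with a short linear crossfade. Everything in it is an assumption made for illustration: the `UNIT_DB` contents, the diphone names, the sample values, and the helper names `crossfade_concat` and `synthesize` are hypothetical and do not come from any real synthesizer.

```python
# Toy unit database: each diphone name maps to a short list of audio samples.
# The names and sample values are invented for illustration only.
UNIT_DB = {
    "h-e": [0.0, 0.2, 0.4, 0.3],
    "e-l": [0.3, 0.5, 0.4, 0.2],
    "l-o": [0.2, 0.1, 0.0, -0.1],
}


def crossfade_concat(units, overlap=2):
    """Join sample lists, linearly blending `overlap` samples at each seam."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight given to the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(diphones, db=UNIT_DB):
    """Look up each requested diphone and concatenate the stored recordings."""
    missing = [d for d in diphones if d not in db]
    if missing:
        raise KeyError(f"no stored unit for: {missing}")
    return crossfade_concat([db[d] for d in diphones])


samples = synthesize(["h-e", "e-l", "l-o"])
```

The sketch also hints at the trade-off the passage mentions: small units such as diphones can be recombined into many utterances, but every seam is a place where audible discontinuities may appear, which is why some smoothing at the joins is usually needed.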

3D sound wave, illustration - Credit: Getty Images

Raising Robovoices

“If you just chain together automatic transcription, translation, and speech synthesis, you end up accumulating too many errors.”