Listen to the evolution of speech synthesizers from 1939-1985

Originally published at: Listen to the evolution of speech synthesizers from 1939-1985 | Boing Boing

3 Likes

from 1939-1985…

Well, did evolution stop in 1985? Because that would answer a lot of questions.

2 Likes

mine did.

4 Likes

Here ya go, Andrea: just sing & dance around. I’m doing that Right Now!

4 Likes

In the '90s I worked for a speech and hearing research company in Cambridge co-founded by Ken Stevens, who had worked extensively with Dennis Klatt.

One of our products was a PC-based speech synthesizer based on Klatt’s KLSYN88. Programming the synthesizer could be difficult, because it took ~40 time-varying parameters as input and hand-tweaking that much data was a pain. Ken’s big innovation on the synthesizer was a system that reduced the number of parameters to ten, and used the constraints of the vocal tract to produce all of the Klatt parameters from those ten.
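For anyone curious what “ten parameters driving ~40” might look like, here’s a toy sketch of the idea. To be clear, this is my own illustration, not Ken’s actual mapping or the HLsyn/KLSYN88 code: the parameter names, the formulas, and the constraints are all invented just to show how a small articulatory frame can be expanded into a larger set of Klatt-style acoustic parameters.

```python
# Purely illustrative sketch (not the actual Stevens/Klatt code):
# a small "high-level" frame is expanded into a larger set of
# Klatt-style parameters using crude vocal-tract-inspired rules.
# All names and formulas here are hypothetical.

from dataclasses import dataclass

@dataclass
class HighLevelFrame:
    f0: float            # fundamental frequency, Hz
    jaw_open: float      # 0..1, rough degree of jaw/mouth opening
    tongue_front: float  # 0..1, tongue body front/back position
    glottal_area: float  # 0..1, more open glottis -> breathier, less voiced

def expand_to_klatt(frame: HighLevelFrame) -> dict:
    """Map a few high-level parameters to many low-level ones.

    A real system derives dozens of time-varying Klatt parameters
    (formant frequencies/bandwidths, source amplitudes, etc.) from
    vocal-tract constraints; this toy version only shows the shape
    of the problem.
    """
    # Formants rise with jaw opening (F1) and tongue fronting (F2/F3).
    f1 = 300 + 500 * frame.jaw_open
    f2 = 900 + 1300 * frame.tongue_front
    f3 = 2500 + 200 * frame.tongue_front
    # Wider glottal opening -> more aspiration, less voicing.
    av = max(0.0, 60 * (1.0 - frame.glottal_area))  # voicing amplitude, dB
    ah = 60 * frame.glottal_area                    # aspiration amplitude, dB
    return {
        "F0": frame.f0,
        "F1": f1, "B1": 60 + 40 * frame.jaw_open,
        "F2": f2, "B2": 90,
        "F3": f3, "B3": 150,
        "AV": av, "AH": ah,
        # ...a real Klatt frame has many more entries
    }

if __name__ == "__main__":
    # One frame of a rough /a/-like vowel.
    print(expand_to_klatt(HighLevelFrame(f0=120, jaw_open=0.8,
                                         tongue_front=0.3, glottal_area=0.1)))
```

The appeal of that kind of design is exactly what made hand-tweaking painful in the first place: the ten high-level values stay physically interpretable, and the synthesizer’s many knobs are guaranteed to move together in ways a real vocal tract could produce.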

We had a demo audio file of Ken speaking the first two lines of The Raven, and a file that used Ken’s high-level system to produce a synthesized version of Ken’s voice saying the same thing. IIRC at some point we mixed the two files up and nobody noticed for months. It was a really good system.

3 Likes

@AndreaJames
For me, one thing of interest is that the developers of the various speech engines skewed the speech to follow an accent! Some seem more British, others different.

Apart from the digital artifacts of the low bit rate, it’s the sliding or glissando between pitches that seems to be the main problem. I was wondering if this was on their radar when creating this magic.

Of course, just at the end of the video’s time frame, speech synthesis was beginning to be available on the desktop.

2 Likes

I still have my ancient RS232 text-to-speech card.

Maybe I’ll fire it up again, but it really was kind of awful. ETA: Besides, my Pis do better:

https://elinux.org/RPi_Text_to_Speech_(Speech_Synthesis)
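If anyone wants to try the Pi route, the simplest option on that page is eSpeak. Something like this sketch works for me, assuming the espeak package is installed (`sudo apt install espeak`); exact flags can differ a bit between espeak and espeak-ng builds:

```python
# Minimal sketch: drive eSpeak from Python on a Pi via subprocess.
# Assumes the espeak command-line tool from the linked eLinux page
# is installed and on the PATH.
import subprocess

def say(text: str, wpm: int = 150, pitch: int = 50) -> None:
    """Speak `text` aloud using the espeak command-line tool."""
    subprocess.run(["espeak", "-s", str(wpm), "-p", str(pitch), text],
                   check=True)

if __name__ == "__main__":
    say("It really was kind of awful, but the Pi does better.")
```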

2 Likes

They are emulating the Transatlantic accent!

This is some sort of malarkey of which you speak; does it cut the mustard?!

With regard to speech synthesis and digital speech recognition/production in general, this topic highlights a certain ‘Western’ bias as the ‘baseline’… I’m thinking of languages that rely on tonal and pitch inflections, for example Vietnamese.

Just now wondering about the impact of diverse cultures and languages on modern tech, and vice versa?

2 Likes

This topic was automatically closed after 5 days. New replies are no longer allowed.