Perfect impression of a contemporary text-to-speech bot

Originally published at: https://boingboing.net/2019/04/09/perfect-impression-of-a-contem.html

4 Likes

This is how I imagine Stephen Miller talking whenever I see a photo of him and his cold, dead doll eyes.

3 Likes

Text-to-speech tech has moved on from the angry robot talk of Stephen Hawking’s voice box, but most of it still lingers in an uncanny place short of natural.

I dunno. That sort of synthesis has been evolving at a very fast pace as of late.

3 Likes

I had a girlfriend who spoke like that. Uncanny.

1 Like

I don’t. See what’s so. Special? Aboutit.

8 Likes

I don’t really see the way to use this for a prank. Fortunately, I am not alone - the whole internet can put on their thinking caps!

2 Likes

Answering a telemarketer’s phone call? “Hello, I am a real human being. Who is this?”

1 Like

On the other end: everyone I know now talks like a robot when they use speech recognition systems. Humans and robots doing the double-take look back as they cross paths in the uncanny valley.

I had a customer service rep accuse me of being a robot while I was trying to maximize my intelligibility reading a string of numbers.

1 Like

I feel like singing software like this doesn’t fit in the same category, since it’s more like an instrument than text-to-speech. The quality of a performance like this depends a lot on the human tuner finessing out the kinks in what the software might otherwise automatically input for things like pitch, phoneme choice, or even “gender factor” or “brightness”.

The automatic results have definitely been getting better, but also I’d trust Jirai to be able to tune a piece of pocket lint.

1 Like

Now I just want to hear this read by Matt Berry as Steven Toast.

It seems to me that TTS systems have gone about as far as possible without understanding the underlying semantics. The part that they still can’t do is the conveying of nuance with intonation. Each individual word may be pronounced perfectly, but the way in which the tones are mixed and elided as those words come together to make a sentence, well, that’s very complicated, and highly dependent on semantic context. Everyone does it automatically, without even thinking about it.

(Although these algorithms do already get into semantics for some things. Mainly homograph disambiguation: is a “dr.” pronounced doctor or drive? She lives / many lives. That kind of thing.)
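
To make the homograph point concrete, here is a toy sketch in Python. It is not how any real TTS engine does it; the function names, the cue-word list, and the rules are all made up for illustration, and the pronunciations are written in ARPAbet-style symbols without stress marks. It just shows the kind of context peeking involved: "Dr." becomes "Doctor" when a capitalized name follows, and "lives" is read as a noun when the previous word looks like a quantifier or possessive.

```python
import re

# Hypothetical cue words suggesting that "lives" is a plural noun.
# This list is an assumption for the sketch, not taken from any real system.
NOUN_CUES = {"many", "their", "our", "nine", "several"}


def expand_dr(text: str) -> str:
    """Expand 'Dr.' to 'Doctor' before a capitalized word, else to 'Drive'."""

    def choose(match: re.Match) -> str:
        following = text[match.end():]
        # "Dr. Smith": whitespace, then a capitalized word, suggests a title.
        if re.match(r"\s+[A-Z][a-z]", following):
            return "Doctor"
        # Otherwise assume a street suffix, as in "12 Elm Dr."
        return "Drive"

    return re.sub(r"\bDr\.", choose, text)


def pronounce_lives(sentence: str) -> list[str]:
    """Return one pronunciation per occurrence of 'lives' in the sentence."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    result = []
    for i, word in enumerate(words):
        if word == "lives":
            previous = words[i - 1] if i > 0 else ""
            # Noun reading ("many lives") vs. verb reading ("she lives").
            result.append("L AY V Z" if previous in NOUN_CUES else "L IH V Z")
    return result


print(expand_dr("Dr. Smith lives at 12 Elm Dr."))
print(pronounce_lives("She lives nine lives."))
```

Rules this crude break easily: "Turn onto Elm Dr. Then stop." would come out as "Elm Doctor", which is roughly why the hard cases end up needing actual semantics.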

Rob, I hope you know I respect you and like your style. But I am a bit sad about your phrasing right there.

Dennis Klatt’s work was groundbreaking, and a part of Hawking’s identity. I feel sorry for the guy if you talk about his creation as an “angry robot voice”.

Link for those who haven’t heard about it:

ETA:

3 Likes
