This software can clone a person's voice by listening to a 5-second sample

frauenfelder · November 13, 2019, 6:49pm

Originally published at: https://boingboing.net/2019/11/13/this-software-can-clone-a-pers.html

…

Brainspore · November 13, 2019, 6:55pm

All the better to impersonate a target’s mother on the phone (as in The Terminator) or possibly a target’s foster mother on the phone (as in Terminator 2: Judgment Day).

ddmone · November 13, 2019, 6:58pm

My mind went to Ferris Bueller.

Brainspore · November 13, 2019, 6:58pm

I’m not sure why a Terminator would want to impersonate Ferris Bueller on the phone but I guess that could work too.

anon50609448 · November 13, 2019, 7:01pm

Well, assuming the Terminator came of age during the 1980s, I’m pretty sure everyone wanted to be Ferris Bueller.

Lexicat · November 13, 2019, 7:24pm

I wonder how it works with transgender voices for trans-folks crossing the binary vocally?

anon75430791 · November 13, 2019, 7:28pm

I was discussing this last night with the missus.

With realtime raytracing hardware now, deepfakes and now this voice cloning capability, if you look over the horizon and imagine slick tooling made available to home users, where you end up is very watchable fanfiction of long-dead TV shows.

I would watch fanmade episodes of “Pushing Daisies”.

You look a little past this point, and the lightbulb goes off for the writers of such shows. Low cost reboots with royalties paid to the original actors.

hastur · November 13, 2019, 8:18pm

Telltale being a slight Austrian tone due to the base voice files.

MononymousSean · November 13, 2019, 9:05pm

Deep fakes just got a LOT deep fakier.

Slant · November 13, 2019, 9:41pm

Shopping list for immediate future (everyday citizen version):

-Guy Fawkes mask
-mirrored clothing
-red laser pointer thingy
-tin foil (double the order)
-hair dye
-multiple VPN subscriptions
-umbrella
-anti-drone netting
-colored contact lenses
-sunglasses
-pi hole
-and a fucking Electrolarynx

Shuck · November 13, 2019, 10:33pm

I’m disappointed that I can’t think of any hacker movies that foresaw this technology, as the immediate use is going to be calling up someone at work to impersonate their boss and get them to compromise the entire organization/transfer millions of dollars.

I’m sure the software don’t care. The training set is “human voices” and it’s not distinguishing gender or age, etc. What it’s learned from that is just how to extrapolate from a small sample, rather than try to reconstruct the voice based on other similar voices. What might throw it off might be unusual speech impediments or medical conditions that make voices inconsistent in unusual ways, or give the speaker’s voice a quality that it doesn’t recognize as human.

I’ll be curious to see how the new James Dean movie does…

anon75430791 · November 13, 2019, 10:39pm

There’s probably a lot of pronunciations that will get messed up.

Reproducing the tone of the voice is very different than the accent or dialect.

Shuck · November 13, 2019, 10:46pm

Yeah, it couldn’t not mess up pronunciations - there’s not enough information for it to pronounce words the way the speaker would. It’ll be using pronunciations from whatever general model they’re using, obviously (though presumably that could be modified on a per-voice basis to sound more authentic). But it’ll sound like that person pronouncing something in a way that they wouldn’t normally. I imagine accent is partially captured, though - there’s also not enough information to fully capture it, but it also does impact tone of voice to some degree, so it’s not totally ignored (and again, presumably modifiable on a per-voice basis).

glowend · November 13, 2019, 10:49pm

So it’s possible that the voicemail I got from Obama telling me to vote for Trump was faked?

Bytheway · November 13, 2019, 10:54pm

Was that during the 2007-8 runnings when Trump was a Democrat? If so, maybe.

Lexicat · November 13, 2019, 11:08pm

I thought I saw that it takes a “sex” parameter, but an actual voice combines aspects of sex (e.g., register and formants resulting from bimodal distribution of phenotypic sex), and gender (e.g., intonation, word choice, non-verbal cues such as vocal fry, etc.), so I wonder. I might have seen this or something very like this a few days ago, and be transposing the other’s sex parameter (if it was a different tool).

Shuck · November 13, 2019, 11:45pm

I haven’t tried to read the paper, but in the video they just happened to mention the gender of the speakers in some samples they had, but it didn’t seem to be relevant to anything in terms of how the process worked.

Foggerton_Catbury · November 13, 2019, 11:53pm

Aha, good! About time too. I’ve been waiting for ages to get a cold, unfeeling machine to perfectly mimic the sweet way my girlfriend says my name so I can horrify myself right off.

Korath · November 14, 2019, 12:57am

5.5 minute video by “Two Minute Papers”.

anon50609448 · November 14, 2019, 1:20am

I used to have a very good voice changer to play around with and honestly no matter what our brains are telling us about male-sounding and female-sounding voices it’s almost entirely just pitch. I doubt the AI would have to bother distinguishing.
