This AI-Generated Joe Rogan Voice Sounds Eerily Like The Real Thing

May 20, 2019 at 11:30 am

In recent years, eerily accurate deepfake videos have gotten a lot of press, but automated voice replication has been quietly sliding into the uncanny valley as well. Case in point: The AI company Dessa has created a simulation of podcaster Joe Rogan’s voice that is nearly indistinguishable from the real thing.

Listen to it in this video that Dessa released last week. According to Dessa, the voice comes from a machine learning model, and all the words come from text input.

Sure, Robo-Rogan doesn’t sound quite as relaxed as the real thing does when he’s stoned and on a roll with a guest. It sounds a bit like the slightly stilted voice he might use if he were reading an ad. But it’s undeniably Rogan’s “voice.”

It’s especially hard to tell whether the voice is real when you hear only short snippets. To prove this, Dessa released a quiz, on which I personally got a failing grade. I’ve heard a lot of his voice over the years, and I had a difficult time telling the difference between Joe Rogan and Joe Fauxgan.

As The Verge pointed out, Dessa obviously had a lot of material to work with. Rogan just released episode 1,299 of his podcast, and most of these episodes run two to three hours. So Dessa could easily access thousands of hours of Rogan’s voice to use for AI training.

The Dessa blog post announcing its speech synthesis model dives into the societal implications of this technology, because “in the next few years (or even sooner), we’ll see the technology advance to the point where only a few seconds of audio are needed to create a life-like replica of anyone’s voice on the planet,” according to Dessa. “It’s pretty f*cking scary.”

The post lays out a few examples of nefarious ways the technology could be used, including spam callers impersonating family members, fake voices being used to gain high security clearance, and audio deepfakes of politicians that could cause an uprising or manipulate elections.

Dessa also provides examples of what it sees as good things that could come from this technology, like automated voices that could make voice assistance more natural, improved text-to-speech applications for people with disabilities, and, um, “a workout app that contains a personalised pre-workout pep talk from Arnold Schwarzenegger.”

All those suggested benefits, I must say, don’t seem to outweigh the dystopian possibilities of anyone being able to mimic anyone else’s voice.

Because of these implications, Dessa said it’s not releasing its model to the public. But it’s probably only a matter of time before we’re going to have to worry about someone threatening to send our boss a recording of us talking about peeing in their office if we don’t send the scammer $5,000 in bitcoin.
