Thieves used audio deepfake technology to clone a businessman’s voice and order a $35 million transfer to foreign accounts, according to a court document obtained by Forbes. It’s the most successful “deep voice” heist so far, though it may be just a small part of a growing trend.
Deepfake technology is fairly well known at this point. Basically, people train an AI to recreate someone’s face, usually the face of an actor or other well-known individual. The AI can then animate this face and paste it onto a reference video, thereby inserting the cloned subject into a scene.
But you can’t just stick someone in a video without recreating their voice. And that’s where audio deepfakes come into play—you train an AI to replicate someone’s voice, then tell the AI what to say in that person’s voice.
Experts believe that once deepfake technology reaches a certain level of realism, it will drive a new era of misinformation, harassment, and crappy movie reboots. But it seems that “deep voice” tech has already reached the big time.
Back in 2020, a bank manager in the U.A.E. received a phone call from the director of a large company. A big acquisition was in the works, according to the director, so he needed the bank to authorize $35 million in transfers to several U.S. accounts. The director pointed to emails from a lawyer to confirm the transfer, and since everything looked legit, the bank manager put it through.
But the “director” of this company was actually a “deep voice” algorithm trained to sound like the real director. The U.A.E. is now seeking U.S. assistance in retrieving the lost funds, which were moved to accounts around the globe by a party of 17 or more thieves.
This is not the first audio deepfake heist, but again, it’s the most successful so far. Similar operations are likely to occur in the future, possibly on a much larger scale. So what can businesses and governments do to mitigate the threat? Well, it’s hard to say.
Because deepfakes are constantly improving, they may eventually become too convincing for humans to reliably identify. But trained AI may be able to spot deepfakes, as cloned faces and voices often contain small artifacts and mistakes, such as digital noise or small sounds that are impossible for humans to make.