From New Scientist
AI voices are hard to spot even if you know audio might be a deepfake
By Jeremy Hsu
Even when people know they may be listening to AI-generated speech, both English and Mandarin speakers still struggle to reliably detect a deepfake voice. That puts the billions of people who understand the world’s most widely spoken languages at potential risk from deepfake scams and misinformation.
Kimberly Mai at University College London and her colleagues challenged more than 500 people to identify speech deepfakes among multiple audio clips. Some clips contained the authentic voice of a female speaker reading generic sentences in either English or Mandarin, while others were deepfakes created by generative AIs trained on female voices...
The study also did not challenge listeners to judge whether the deepfakes sounded like the specific person being mimicked, says Hany Farid at the University of California, Berkeley. Identifying the authentic voice of a particular speaker matters in real-life scenarios: scammers have cloned the voices of business leaders to trick employees into transferring money, and misinformation campaigns have uploaded deepfakes of well-known politicians to social media networks...
Hany Farid is a professor in the Department of Electrical Engineering & Computer Sciences and the School of Information at UC Berkeley. He specializes in digital forensics.