The first really great “bad” computer voice had to be Joshua from WarGames (1983). Before that, voice-over actors delivered nasal, monotone lines (“Danger, Will Robinson!”) to make audiences believe they were hearing a machine, but Joshua had all the right pieces: sometimes monotone, sometimes the slightest of inflections, and an endearing, almost childlike innocence to his clunky, cold-hearted attempts to destroy the world.
But today’s tech fans are a little more discerning, and computers just got a little more human thanks to Google’s efforts to make its AI sound more like a person and less like a machine. Google’s DeepMind unit has used a new model called WaveNet to produce speech that sounds far more natural than previous attempts.
Long ago, programmed voices strung together pre-recorded words in the correct order, and the effect was very choppy and disconnected. After that, programmers began relying on pre-recorded diphones, small fragments of speech that each span the transition from one sound to the next, which the computer would stitch together. That made for more natural-sounding speech, a leaps-and-bounds improvement for its time.
But now, Google’s “neural network”-based project has “taught” its computer to speak with a more natural, human inflection. By repeatedly training the computer in both English and Mandarin, the developers claim to have reduced the “it’s not human” factor by almost fifty percent, according to survey results from listeners.
Of course, the goal isn’t just to make a better Alexa (or Google Home, per the developers), but to create a humanlike voice that can actually interact with the user. That capability isn’t here yet, as the AI team is still working to teach the computer to create its own responses rather than select from a menu of programmed choices. In the meantime, there are potential applications for more human-sounding voices from technology, especially in rehabilitation, assistive technology for disabled individuals, and education.