DeepZen is a voice synthesis project that uses artificial intelligence algorithms from IBM’s PowerAI and Watson technologies. DeepZen has developed text-to-speech tools that not only sound human at first listen, but can also pick up on the emotional cues needed to read text in a compelling manner. In doing so, the company claims that it could reduce the time and cost of producing audiobooks by up to 90%.
Taylan Kamis, CEO and Co-founder of DeepZen, explains: “Our aim isn’t to put voice actors out of jobs, but rather to solve the capacity issues in the current market. We identify emotion in text automatically and use voice samples – for which we pay royalties to voice actors – combined with speech synthesis technology to produce convincing voice audio.
“To do this, we needed to create large and complex neural networks. These require extensive amounts of processing power to produce accurate results fast, so we needed the right technology platform to bring our vision to life.”
DeepZen promises it is not going to put narrators out of a job; rather, its technology will help smaller publishers and indie authors create audiobooks without the hassle of dealing with professional narrators. In the coming months, DeepZen aims to commercialise its technology by working with its partners and through its own audiobook marketplace, Audiowhale.com.
Michael Kozlowski is the editor-in-chief at Good e-Reader and has written about audiobooks and e-readers for the past fifteen years. Newspapers and websites such as the CBC, CNET, Engadget, Huffington Post and the New York Times have picked up his articles. He lives in Vancouver, British Columbia, Canada.