Eleven Labs recently shared a demo of their new voice synthesis AI. It is worth listening to the audio samples. I don’t think they are significantly better than the recent demo released by Apple, but that is precisely why I find this so noteworthy: a small company can now build an offering comparable to those of the industry’s largest players.
Also, I have to admit, their Steve Jobs voice simulation demo is impressive.
Finally, as time goes on I am increasingly unable to understand why none of these recent advancements have trickled down into voice assistants. Why not hook up a speech recognition AI to GPT and then speak the result using one of these voice generation AIs? It must be inference cost, right? Otherwise, I must be missing something.
Microsoft and OpenAI together could chain Whisper to ChatGPT to VALL-E and dub it Cortana 2.0. Or put it in a smart speaker and instantly blow the Amazon Alexa, Apple HomePod, and Google Home offerings out of the water. And that is just using projects OpenAI and Microsoft have already shown publicly!
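To make the idea concrete, here is a rough sketch of the three-stage pipeline: speech in, text through a language model, speech out. Everything in it is hypothetical. The `VoiceAssistant` class and the `fake_*` functions are stand-ins I made up so the example runs on its own; they are not the Whisper, ChatGPT, or VALL-E APIs. In a real system each stage would be a call to an actual model, and that is presumably where the inference cost and latency come from.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceAssistant:
    """Chains three pluggable stages; each one is just a callable."""
    transcribe: Callable[[bytes], str]   # speech recognition (a Whisper-like model)
    respond: Callable[[str], str]        # language model (a ChatGPT-like model)
    synthesize: Callable[[str], bytes]   # voice generation (a VALL-E-like model)

    def handle(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)   # audio -> text
        reply = self.respond(text)         # text -> reply text
        return self.synthesize(reply)      # reply text -> audio


# Hypothetical stand-ins: they treat "audio" as UTF-8 bytes so the
# sketch runs without any models or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


assistant = VoiceAssistant(fake_transcribe, fake_respond, fake_synthesize)
print(assistant.handle(b"what time is it"))  # b'You said: what time is it'
```

Because the stages are plain callables, swapping in a different recognizer or voice model would not change the rest of the pipeline. The stages also run one after another, so their delays add up, which may be part of why this is harder to ship than it looks.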