Stanislas Polu has some predictions about the state of large language models next year. The only prediction of his that I myself feel especially confident in is the following:
There will be an opensource Chinchilla-style LLM released this year at the level of text-davinci-*. Maybe not from the ones we expect🤔This will obliterate ChatGPT usage and enable various types of fine-tuning / soft-prompting and cost/speed improvements.
I would say that now, especially after the success of ChatGPT, an equivalent open-source LLM will almost certainly be released in the next year. This will likely follow the same pattern as image generation AIs earlier this year: first, OpenAI released DALL-E 2 as a private beta. Then, a little while later, Stable Diffusion was released which, although it wasn’t quite as good as DALL-E, was free, open-source, and widely accessible. This allowed for an explosion of creative applications, including Photoshop plugins, 3D modeling plugins, and easy-to-install native frontend interfaces.
While I believe text generation AIs will have a similar moment early next year, the unfortunate truth is that running even a pre-trained text generation network requires significantly more memory than running a similarly sized image generation network. This means we will probably not see early open-source text generation networks running natively on consumer hardware such as iPhones, as we have with Stable Diffusion (although it is possible Apple will, once again, help with that).
My hope is that these developments in text generation spur some much-needed innovation in household voice assistants, which increasingly feel dated.