Facebook has released another open-source model as it works to commoditize generative AI.
“We introduce MusicGen, a single Language Model (LM) that operates over several streams of compressed discrete music representation… we demonstrate how MusicGen can generate high-quality samples, while being conditioned on textual description or melodic features, allowing better controls over the generated output.”
I was amazed by Google’s MusicLM model earlier this year. Facebook provides side-by-side comparisons here that demonstrate MusicGen is clearly superior. It isn’t an enormous leap, but audio generated using Google’s model has a distinct “compressed” quality that is greatly diminished in Facebook’s implementation.
More importantly, MusicGen is completely open. Google only recently opened MusicLM to beta testing through its AI Test Kitchen app, and even there, generations are limited to 20 seconds. Facebook, by contrast, released both the code and the model weights on GitHub and spun up a Colab notebook demo.