Petals is an open source project that lets you run large language models on standard consumer hardware using "BitTorrent-style" distributed computing. From the GitHub repository:

> Petals runs large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning.
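
To give a sense of what this looks like from the client side, here is a minimal sketch adapted from the usage example in the Petals README. The `DistributedBloomForCausalLM` class and the `bigscience/bloom-petals` model name reflect the BLOOM-era release of the library and may differ in newer versions, so treat the exact identifiers as assumptions:

```python
# Minimal Petals client sketch: the full BLOOM weights never live on this machine.
# Only the embeddings run locally; transformer blocks are served by remote peers.
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"  # Petals-specific checkpoint name (assumed, BLOOM-era)

tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

# Each generate() call routes activations through peers serving the remaining layers.
inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```

The interesting part is what you don't see: `generate()` looks like a normal Hugging Face call, but under the hood each forward pass is streamed through whichever peers happen to be serving the layers you don't hold locally.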

In the past I have written about how locally run, open source large language models will open the door to exciting new projects. Petals seems like an interesting alternative while we wait for optimizations that make running these models fully on-device less resource-intensive.