Eryk Salvaggio:

Diffusion models all start in the same place: a single frame of random, spontaneously generated Gaussian noise. The model creates images by trying to work backward from the noise to arrive at an image described by the prompt. So what happens if your prompt is just “Gaussian noise”?

[…]

In theory, the machine would simultaneously aim to reduce and introduce noise to the image. This is like a synthetic paper jam: remove noise in order to generate “patterns” of noise; refine that noise; then remove noise to generate “patterns” of noise; etc. Recursion… In simple terms: The model would have a picture of Gaussian noise in front of it. And it would look at it and say: “OK, I have to remove this Gaussian noise until I get to Gaussian noise.”
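
The loop described above can be caricatured in a few lines of Python. This is only a toy sketch under loud assumptions: there is no neural network and no real noise schedule here, and the names `toy_reverse_process`, `prompt_is_gradient`, and `prompt_is_gaussian_noise` are hypothetical stand-ins invented for illustration, not part of any diffusion library. Each step simply moves the current image a fraction of the way toward whatever the stand-in "denoiser" guesses the prompt looks like.

```python
import numpy as np


def toy_reverse_process(predict_clean, steps=50, shape=(8, 8), seed=0):
    """Toy stand-in for a diffusion sampler (not a real model or schedule)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)  # every run starts from Gaussian noise
    for t in range(steps, 0, -1):
        guess = predict_clean(x, t, rng)  # hypothetical guess at the prompted image
        x = x + (guess - x) / t  # step a fraction 1/t of the way toward the guess
    return x


def prompt_is_gradient(x, t, rng):
    # Hypothetical denoiser for an ordinary prompt: it always guesses the same picture.
    return np.linspace(0.0, 1.0, x.size).reshape(x.shape)


def prompt_is_gaussian_noise(x, t, rng):
    # Hypothetical denoiser for the prompt "Gaussian noise": its guess is fresh noise.
    return rng.standard_normal(x.shape)


picture = toy_reverse_process(prompt_is_gradient)
noise = toy_reverse_process(prompt_is_gaussian_noise)
print("ordinary prompt, pixel std:", picture.std())
print("'Gaussian noise' prompt, pixel std:", noise.std())
```

With the ordinary prompt, the loop settles on the fixed picture. With the "Gaussian noise" prompt, the target at every step is a new noise sample, so the process removes noise only to arrive at more noise, which is roughly the paper jam the passage describes. A real model would behave differently in detail; the sketch only makes the circularity concrete.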