What is CFG Scale?

CFG Scale is prompt weight! If you have any other AI questions, let me know in the comments!

Just kidding, but thinking of it like “prompt weight” is essentially correct. Digging a little deeper, it controls how strongly the text prompt’s influence on the noise is favoured over a random image generated from the same noise.

Banana CFG 1.1
Banana CFG 10.0

The CFG Scale Process

When creating an image, Stable Diffusion begins with a noisy image and applies a de-noising algorithm, trying to generate the given text prompt. At the same time, the algorithm works to produce a random image by taking the “path of least resistance”, creating whatever image the noise most resembles. So at any one step, Stable Diffusion has generated something closer to your prompt, let’s say a banana. It has also generated something closer to what the noise suggests: if the noise appears to be shaped like an apple and has some red colour, the algorithm may work to create an apple image, even though the text prompt is about bananas.

It does both the text prompt de-noising and the random de-noising at each step, then looks at where each process took the image. The CFG scale is applied to the difference between those two results. Roughly, at CFG 1 it’s giving you the result from the random de-noise, and at CFG 10 it’s giving you the result from the prompt-guided de-noise. You can choose a spot between these two points to find an image that is similar both to what the text prompt would create and to what the noise would create.
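To make the “difference between the two” idea concrete, here is a minimal sketch of the blend as classifier-free guidance is commonly written: start from the random (unconditional) prediction and move along the difference toward the prompt-guided one. The function name `apply_cfg` and the three-number lists are made up for illustration; a real model works on large tensors. Note that in this common form a scale of 0 returns the random prediction and a scale of 1 returns the prompt-guided prediction unchanged, so the CFG 1 and CFG 10 endpoints described above are best read as rough intuition rather than exact values.

```python
def apply_cfg(uncond, cond, scale):
    """Blend two noise predictions for one de-noising step.

    uncond: prediction made without the prompt (the "random" de-noise)
    cond:   prediction made with the prompt (the prompt-guided de-noise)
    scale:  the CFG scale
    """
    # Start at the unprompted prediction and move `scale` times
    # the difference in the direction of the prompted one.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values standing in for real model outputs (hypothetical numbers).
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

no_prompt = apply_cfg(uncond, cond, 0.0)     # identical to uncond
plain_prompt = apply_cfg(uncond, cond, 1.0)  # identical to cond
pushed = apply_cfg(uncond, cond, 7.0)        # extrapolated past cond

print(no_prompt, plain_prompt, pushed)
```

Anything above 1 goes beyond the prompt-guided prediction, which may be why typical settings such as 7 look so much more “on prompt” than values near 1.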

Banana CFG 7.0

Why Create Random Noise Images?

Stable Diffusion is trained to correlate ideas with shapes, texture and colour, so that when called to generate an image, it will de-noise blobs of noise that look similar to what the subject would look like if it were turned to noise. But when first starting to create the image, it starts at pure noise, which may place these familiar blobs in weird places: the stem of the banana might be in the corner of the image as well as in the middle, while the peel seems to snake around the middle. Although it’s creating all the proper shapes, textures and colours of a banana, it’s trying too hard to force these features to the shape of the noise and will produce wobbly and weird-looking results.

Banana CFG 1.1 Seed 3343
Banana CFG 1.1 Seed 3345
Banana CFG 1.1 Seed 3347

De-noising random noise will start to shape the image into something with a composition of a trained image, whatever image is most similar to the noise, although unrelated to the text prompt.

Now that we know what the noise will look like as a banana and as something random, we can draw a line between these two points. This gives us a scale from not a banana, to a banana and, going further, to a super banana. I lied a little earlier in saying that CFG 10 would give the result of just the pure text-to-image. That’s because the random noise image could be very similar to or very different from the text-to-image result, but since we need to apply this to every image regardless of the actual distance, we need to normalize the result to a scale.

With this normalization done, we now can think of CFG as a line from the noise to our intention and beyond. Finding the CFG scale that will produce the best image will depend on the prompt and on the initial noise, but generally you will want to aim for something around 7 for a happy medium.

With higher CFG values, the parts of the image that Stable Diffusion believes to be representative of the prompt begin to be overrepresented. Visually, the image may look darker around edges: where an edge would usually be somewhat darker, it is now very dark. Texture that may have been subtle before is now prominent.
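A toy calculation can hint at why high values exaggerate prompt-related features. The sketch below assumes the commonly used guidance formula (random prediction plus scale times the difference) and uses invented three-number predictions, so it illustrates the arithmetic only and is not a measurement from Stable Diffusion. Wherever the two predictions agree, the scale changes nothing; wherever the prompt changes the prediction, that change is multiplied by the scale.

```python
import math


def guide(uncond, cond, scale):
    # Commonly used guidance formula; assumed here, not taken from the article.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def length(vec):
    # Euclidean length of a list of numbers.
    return math.sqrt(sum(x * x for x in vec))


# Hypothetical predictions: the prompt changes the first two values
# and leaves the third one alone.
uncond = [0.2, 0.0, -0.1]
cond = [0.5, 0.4, -0.1]

results = {}
pushes = {}
for scale in (1.1, 4.0, 7.0, 10.0):
    guided = guide(uncond, cond, scale)
    results[scale] = guided
    # How far the guided result has moved away from the random prediction.
    pushes[scale] = length([g - u for g, u in zip(guided, uncond)])
    print(f"CFG {scale:>4}: moved {pushes[scale]:.2f}, untouched value {guided[2]}")
```

The distance moved grows in direct proportion to the scale, while the value the prompt did not affect stays where it was. That is one way to picture subtle edges and textures becoming very dark or very prominent at high CFG.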

Here is a GIF showing the CFG scale progress on the same seed from 1.1 to 10.0.

Prompt: incredibly detailed Banana

Put CFG Scale To Use!

Learn how you can start generating your own incredible banana images with the free and open source application MitchJourn-E. MitchJourn-E is a powerful AI image generation app that uses Invoke-AI as its backend to create beautiful images.
