STABLE DIFFUSION CFG SCALE 1 IGNORES NEGATIVE
Have you ever meticulously crafted a negative prompt in Stable Diffusion, only to feel like it's being completely ignored? You're not alone. The CFG scale, or Classifier-Free Guidance scale, is a crucial setting in nearly every Stable Diffusion AI image generator, dictating how closely the AI adheres to your prompts during the image generation process. While a higher CFG scale generally forces the AI to stick more rigidly to *both* positive and negative prompts, sometimes things don't go as planned. This can be frustrating, especially when you're trying to fine-tune your image and remove unwanted elements. Understanding the nuances of the CFG scale and its interaction with negative prompts is essential to unlocking the full potential of Stable Diffusion.
We'll explore what CFG scale actually *is*, how it interacts with positive and negative prompts, and, most importantly, why your negative prompts might seem to be ignored and how to fix it. From sampler choice to CFG scale values to hidden settings, we'll cover everything you need to know to troubleshoot and optimize your Stable Diffusion workflow.
What is the CFG Scale in Stable Diffusion?
The CFG scale, short for Classifier-Free Guidance scale, is a parameter that controls how much the AI generation process should be guided by your prompt. Think of it as a dial that adjusts the "strictness" with which the AI follows your instructions. A lower CFG scale gives the AI more creative freedom, allowing it to deviate from your prompt and introduce unexpected elements. Conversely, a higher CFG scale forces the AI to adhere more closely to your prompt, resulting in an image that more accurately reflects your desired outcome. The sweet spot usually lies somewhere in the middle, balancing creative interpretation with faithful reproduction.
OpenArt, for example, uses a default CFG scale of 7, which they find provides a good balance between creativity and prompt adherence. Experimentation is key, as the ideal CFG scale can vary depending on the model, prompt, and desired style. Many Stable Diffusion UIs will restrict the CFG scale to a positive number between 1 and 30. Setting the CFG scale too high, while possible in some interfaces, can lead to oversaturation and other undesirable effects.
The Role of Positive and Negative Prompts
Stable Diffusion uses both positive and negative prompts to guide the image generation process. The positive prompt describes what you *want* to see in the image, while the negative prompt specifies what you *don't* want. The negative prompt is a powerful tool for refining your images and removing unwanted artifacts, styles, or elements.
Here's how it works:
- Positive Prompt: Tells the AI what to create. For example: "masterpiece, best quality, intricate details, 1girl, elegant clothes, happy"
- Negative Prompt: Tells the AI what to *avoid*. For example: "zoomed in, blurry, oversaturated, warped"
By combining positive and negative prompts, you can significantly improve the quality and accuracy of your generated images. The AI essentially tries to maximize the presence of elements described in the positive prompt while minimizing the elements listed in the negative prompt.
Why Your Negative Prompt Might Be Ignored
So, what happens when your meticulously crafted negative prompt seems to have no effect? There are several potential reasons why Stable Diffusion might be ignoring your negative prompt, especially in relation to the CFG scale:
- Low CFG Scale: At very low CFG scales (close to 1), the AI has more freedom to deviate from *both* the positive and negative prompts. This means the negative prompt's influence is minimized, and the AI is more likely to introduce unwanted elements. At a CFG scale of exactly 1, the guidance formula reduces to the positive prompt's output alone, so the negative prompt has no effect at all.
- Sampler Issues: Certain samplers, such as DDIM or PLMS, have been reported to exhibit issues with negative prompts, particularly when long or complex negative prompts are used. With these samplers, your meticulously crafted negative prompts may simply be ignored by the AI.
- Clip Skip Issues: The clip skip setting influences which layers of the CLIP model are used during the process. When set to values beyond 1 or 2, it can sometimes interfere with the proper processing of negative prompts.
- Prompt Formatting: Incorrectly formatted prompts, such as missing commas or typos, can confuse the AI and prevent it from properly interpreting your negative prompt.
- Vague or General Negative Prompts: Using overly general negative prompts like "bad quality" or "ugly" may not be specific enough for the AI to understand what you want to avoid.
- Model Limitations: Some models are simply less responsive to negative prompts than others. This is due to how the models were trained.
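The causes above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any Stable Diffusion UI: the function name `diagnose`, its parameters, and the warning texts are all hypothetical, and the thresholds simply restate the list above.

```python
def diagnose(cfg_scale, sampler, clip_skip, negative_prompt):
    """Return warnings for the possible causes listed above (illustrative only)."""
    warnings = []
    if cfg_scale <= 1:
        warnings.append("CFG scale is 1 or lower, so the negative prompt has little or no effect")
    if sampler.strip().lower() in {"ddim", "plms"}:
        warnings.append("DDIM and PLMS have been reported to mishandle long negative prompts")
    if clip_skip > 2:
        warnings.append("clip skip above 2 can interfere with negative prompts")
    # Split the negative prompt into comma-separated terms before matching.
    terms = {t.strip().lower() for t in negative_prompt.split(",")}
    if terms & {"bad quality", "ugly"}:
        warnings.append("negative prompt contains vague terms")
    return warnings


# Hypothetical settings: one risky combination, one that passes every check.
risky = diagnose(1, "DDIM", 3, "ugly, blurry")
safe = diagnose(7, "Euler a", 1, "extra limbs, watermark")

for line in risky:
    print("-", line)
```

A check like this cannot detect model limitations or typos; it only flags the settings that are cheap to inspect before generating.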
The Math Behind It: Understanding the Formula
To understand why this happens, let's look at the formula that underpins how CFG scale influences prompts. The simplified formula is this:
model(neg) + CFG_scale * (model(pos) - model(neg))
Where:
- model(pos) is the model's output based on the positive prompt.
- model(neg) is the model's output based on the negative prompt (or the unconditioned output if no negative prompt is used).
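The formula is easy to check numerically. The sketch below uses plain Python lists as stand-ins for the model's outputs; the helper name `cfg_combine` and the sample numbers are assumptions made for illustration, not code from any particular UI. It shows that at a CFG scale of 1 the model(neg) terms cancel, so the result no longer depends on the negative prompt.

```python
def cfg_combine(pos, neg, cfg_scale):
    """Apply model(neg) + cfg_scale * (model(pos) - model(neg)) element-wise."""
    return [n + cfg_scale * (p - n) for p, n in zip(pos, neg)]


# Toy stand-ins for the two model outputs (made-up numbers).
pos = [0.5, -0.25, 1.0]
neg_a = [0.0, 0.0, 0.0]
neg_b = [2.0, -4.0, 8.0]

# At scale 1 the result equals pos, whatever the negative output is.
at_one_a = cfg_combine(pos, neg_a, 1.0)
at_one_b = cfg_combine(pos, neg_b, 1.0)

# At scale 7 the negative output changes the result.
at_seven_a = cfg_combine(pos, neg_a, 7.0)
at_seven_b = cfg_combine(pos, neg_b, 7.0)

print(at_one_a == at_one_b)      # same result at scale 1
print(at_seven_a == at_seven_b)  # different result at scale 7
```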
When the negative prompt is empty, the formula applies an offset of length 'x' * CFG_scale, where 'x' is the distance between the positive prompt's output and the unconditioned output. When a negative prompt *is* used, the offset can approach 2 * 'x' * CFG_scale, because the negative prompt's output can sit on the opposite side of the unconditioned output rather than at it. This is a rough rule of thumb rather than an exact result, but it shows how the difference affects the strength of the influence of each prompt.
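The 'x' versus 2 * 'x' comparison can be illustrated with a toy example. This is only a sketch under strong assumptions: two-dimensional points stand in for model outputs, the unconditioned output is placed at the origin, and the negative output is placed exactly opposite the positive one, which real prompts will rarely do. The helper name `offset_length` is hypothetical.

```python
import math


def offset_length(pos, base, cfg_scale):
    """Length of the offset cfg_scale * (pos - base)."""
    distance = math.sqrt(sum((p - b) ** 2 for p, b in zip(pos, base)))
    return cfg_scale * distance


pos = [3.0, 4.0]             # positive output, at distance x = 5 from the origin
unconditioned = [0.0, 0.0]   # stand-in for an empty negative prompt
opposite_neg = [-3.0, -4.0]  # negative output placed directly opposite pos

cfg_scale = 7
empty_case = offset_length(pos, unconditioned, cfg_scale)    # x * CFG_scale
negative_case = offset_length(pos, opposite_neg, cfg_scale)  # 2 * x * CFG_scale

print(empty_case, negative_case)
```

In this idealized setup the offset doubles when the negative prompt is used; with real prompts the factor will usually fall somewhere below 2.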
Troubleshooting and Optimizing Your CFG Scale and Negative Prompts
If you're experiencing issues with your negative prompts being ignored, here are some steps you can take to troubleshoot and optimize your workflow:
- Increase the CFG Scale: Try increasing the CFG scale to a value between 7 and 12. This will generally increase the influence of both your positive and negative prompts, leading to a more accurate representation of your desired image.
- Use Specific Negative Prompts: Replace vague terms like "bad quality" with specific issues you're seeing in your images. For example, instead of "ugly," use "deformed face," "distorted anatomy," or "extra limbs."
- Experiment with Different Samplers: If you're using DDIM or PLMS, try switching to a different sampler like Euler a or DPM++ 2M Karras. These samplers are generally more reliable when it comes to handling negative prompts.
- Check Clip Skip Settings: If you're using clip skip, try setting it to 1 or 2 to see if it resolves the issue.
- Review Prompt Formatting: Ensure that your prompts are correctly formatted with commas separating keywords. Double-check for typos or other errors that could confuse the AI.
- Test with Different Models: Try using different Stable Diffusion models to see if some are more responsive to negative prompts than others.
- Use a Negative Embedding: Consider using a negative embedding like "EasyNegative" to help remove common unwanted elements. These embeddings are trained to suppress common artifacts and can be a useful addition to your negative prompt.
- Refine Your Positive Prompt: Sometimes, the issue isn't with the negative prompt, but with the positive prompt itself. Try adding more detail or specificity to your positive prompt to guide the AI in the right direction.
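The prompt formatting step in the list above can be partly automated. Here is a minimal sketch of a cleanup helper; the name `normalize_prompt` is hypothetical, and the function assumes a simple comma-separated prompt. It does not understand UI-specific weighting or bracket syntax, so a term that contains commas inside parentheses would be split incorrectly.

```python
def normalize_prompt(prompt):
    """Tidy a comma-separated prompt: trim spaces, drop empty and repeated terms."""
    seen = set()
    terms = []
    for raw in prompt.split(","):
        term = " ".join(raw.split())  # collapse runs of whitespace
        key = term.lower()            # compare terms case-insensitively
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return ", ".join(terms)


cleaned = normalize_prompt("text,  watermark,, Text ,signature")
print(cleaned)
```

The helper catches stray commas and repeated terms, but typos still need a manual read-through.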
Practical Examples and Scenarios
Let's illustrate these concepts with a few practical examples:
Scenario 1: Removing Unwanted Text
Suppose you're generating an image of a landscape, but the AI keeps adding unwanted text. Your positive prompt might be: "beautiful landscape, mountains, trees, sunset." Your initial negative prompt might be: "text, watermark."
If the text is still appearing, try these adjustments:
- Increase CFG Scale: Increase the CFG scale from 7 to 10.
- Specific Negative Prompts: Replace "text, watermark" with "text, watermark, signature, artist name."
Scenario 2: Fixing Distorted Anatomy
You're trying to generate a portrait of a person, but the anatomy is distorted. Your positive prompt is: "portrait of a woman, detailed face, realistic." Your negative prompt is: "deformed, bad anatomy."
If the distortions persist:
- Increase CFG Scale: Increase the CFG scale from 7 to 12.
- Specific Negative Prompts: Add more specific terms to your negative prompt: "deformed, bad anatomy, extra limbs, missing fingers, disfigured face."
- Try a Different Sampler: If you're using DDIM, switch to Euler a.
Scenario 3: Removing Oversaturation
Your images are consistently oversaturated. Your positive prompt is: "vibrant colors, fantasy landscape." Your negative prompt is: "oversaturated."
To address this:
- Adjust Distilled CFG Scale: If your UI supports it, experiment with the Distilled CFG Scale. Sometimes adjusting this can be more effective than the main CFG scale for texture issues.
- Specific Negative Prompts: Add "vibrant colors" to the negative prompt. Yes, that sounds counterintuitive, but it can help to balance the effect. Also add: "oversaturated, bright, neon."
Common Questions About CFG Scale and Negative Prompts
What is the ideal CFG scale value?
There is no single "ideal" CFG scale value. The best value depends on the model, prompt, and desired style. A good starting point is between 7 and 12, but experimentation is key.
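One way to structure that experimentation is a small sweep over CFG values and samplers. The sketch below only builds the list of settings to try; `build_sweep` and the dictionary keys are hypothetical and do not correspond to any specific UI or API. Holding the seed fixed is an assumption added here so that differences between images come from the settings rather than from random noise.

```python
from itertools import product


def build_sweep(cfg_values, samplers, seed=1234):
    """Return one settings dict per (CFG scale, sampler) combination."""
    return [
        {"cfg_scale": cfg, "sampler": sampler, "seed": seed}
        for cfg, sampler in product(cfg_values, samplers)
    ]


# CFG scales 7 through 12, each paired with two samplers.
runs = build_sweep(range(7, 13), ["Euler a", "DPM++ 2M Karras"])

for run in runs[:3]:
    print(run)
```

Each entry can then be copied into whichever interface you use, generating one image per combination for a side-by-side comparison.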
Does a higher CFG scale always produce better results?
Not necessarily. While a higher CFG scale can increase prompt adherence, it can also lead to oversaturation, artifacts, and a loss of creativity. It's important to find a balance that works for your specific use case.
Are negative prompts always necessary?
No, negative prompts are not always necessary, but they can be extremely helpful for refining your images and removing unwanted elements. They are particularly useful when you're struggling to achieve a specific look or when you're encountering persistent artifacts or distortions.
Can negative prompts completely override the positive prompt?
No, negative prompts cannot completely override the positive prompt. The AI will still attempt to fulfill the instructions in the positive prompt, but the negative prompt will guide it to avoid certain elements or styles.
Conclusion: Mastering the CFG Scale for Optimal Results
The CFG scale is a powerful tool for controlling the creative process in Stable Diffusion. Understanding how it interacts with positive and negative prompts is essential for achieving the desired results. While it can be frustrating when negative prompts seem to be ignored, by systematically troubleshooting and experimenting with different settings, you can unlock the full potential of Stable Diffusion and generate stunning, high-quality images.
Key takeaways:
- The CFG scale controls how closely the AI follows your prompts.
- Negative prompts specify what you *don't* want in your image.
- Low CFG scales can minimize the influence of negative prompts.
- Certain samplers and clip skip settings can interfere with negative prompts.
- Specific negative prompts are more effective than vague ones.
Experiment with different CFG scale values, samplers, and negative prompt strategies to find what works best for your specific models and artistic goals. Happy generating!