date: Dec 12, 2024
type: Page
status: Published
summary: Image translation with ControlNet
tags: Tutorial, Teaching, ComfyUI, UCL
category: Knowledge

1. Principle of ControlNet

⚠️
Before starting this chapter, please download the required models and place the model files in the corresponding folders.
If you want to use the workflow from this chapter, you can either download and use the Comflowy local version or sign up and use the Comflowy cloud version, both of which have the chapter's workflow built in. Additionally, if you're using the cloud version, you can use our built-in models directly without needing to download anything.
When using Stable Diffusion, you may wish to control the composition of the image, but adjusting prompts alone may not yield great results. This chapter will teach you several common methods of using ControlNet to control image composition.

Principle Introduction

Essentially, all the methods I teach in the advanced tutorial are image-to-image methods: they each provide different information to the model through images, so the model can generate the images we want. ControlNet controls the image the model generates based on structural information. This structural information could be a sketch, a mask, or even the edge map of an image. All of this information can be fed through ControlNet to steer the model's generation, and you can choose different ControlNet models depending on what you want to control.
As usual, to help you better understand how to use ControlNet, we will first introduce the principle of ControlNet visually:
notion image
From the picture above, we can see that when we use ControlNet, we first input the text prompt and the image into the ControlNet model. The ControlNet model then generates a latent, which is used as Conditioning along with the initial prompt and fed into the Stable Diffusion model, thus affecting the image the model generates.
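If it helps to see this flow as code, here is a minimal sketch using the Hugging Face diffusers library rather than ComfyUI (the model IDs and file names are assumptions; any SD1.5 checkpoint plus a matching ControlNet will do):

```python
# Minimal sketch of the ControlNet flow with diffusers (not ComfyUI).
# Model IDs and file names are assumptions, not a prescription.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Load a ControlNet (scribble variant) and an SD1.5 base model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The structural image and the text prompt together form the conditioning
# that steers what Stable Diffusion generates.
sketch = load_image("sketch.png")
result = pipe("a cute cartoon character with big ears", image=sketch).images[0]
result.save("result.png")
```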

Scribble ControlNet Workflow

Through this introduction to the principle, you should be able to deduce how to use ControlNet in ComfyUI. Let's try building a simple ControlNet workflow: controlling generation with a simple sketch. The effect looks roughly like this:
notion image
With ControlNet, the model constructs its output according to the sketch you draw. As you can see, the left sketch is a very rough cartoon character, while the generated image on the right matches its basic composition and character features (two big ears).
OK, now that you have an impression of Scribble ControlNet, let's construct this workflow together. You can try building it on your own first; doing so will deepen your understanding.
💡
In the LoRA chapter, I compared LoRA to a filter. ControlNet, I believe, is more like a visual supplement to the prompt: it visualizes instructions that are hard to describe in text and helps the model understand them better, working around the CLIP text encoder's rather poor grasp of grammar. If you understand it this way, you should find it easier to reason about and remember how to connect the nodes. LoRA is a filter that affects the model, so it's connected to the Model; ControlNet supplements the prompt and controls the Conditioning, so it's connected to the prompt nodes.
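The same contrast can be made concrete in code. Continuing the diffusers sketch above (the LoRA path is a placeholder), a LoRA is loaded into the model itself, while ControlNet's input travels with the prompt:

```python
# Continuing the earlier sketch; the LoRA path is a placeholder.
# LoRA affects the *model*: its weights are patched into the UNet.
pipe.load_lora_weights("path/to/lora.safetensors")

# ControlNet supplements the *conditioning*: the control image is passed
# alongside the prompt at generation time, not merged into the model.
result = pipe("a cute cartoon character with big ears", image=sketch).images[0]
```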

Pose ControlNet Workflow

Once you can build a ControlNet workflow, you can freely switch between different models according to your needs.
The previous example used a sketch as input; this time, let's try inputting a character's pose. The advantage is that you can control the pose of the character the model generates. Like this:
notion image
However, note that unlike the previous example, we can't feed the photo directly into the ControlNet model; we first need to convert the image into a pose skeleton and then input that. Alternatively, you can use other tools to create a skeleton diagram and feed it into the ControlNet model directly.
notion image
So the construction of the entire workflow is the same as the previous one; we only need to load the ControlNet Openpose model in the Load ControlNet Model node and load the skeleton diagram:
notion image
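If you'd rather do the image-to-pose conversion outside ComfyUI, the controlnet_aux package provides the same kind of preprocessors as the ComfyUI nodes. A sketch, assuming controlnet_aux is installed (the checkpoint repo is the one its documentation uses):

```python
# Sketch: extracting a pose skeleton with controlnet_aux, the same job
# the DWPose/OpenPose preprocessor node performs inside ComfyUI.
from controlnet_aux import OpenposeDetector
from diffusers.utils import load_image

detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
photo = load_image("person.png")
pose = detector(photo)   # returns the skeleton diagram as a PIL image
pose.save("pose.png")    # this is what the ControlNet Openpose model consumes
```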

Depth ControlNet Workflow

The third use of ControlNet is to control the generated image through depth maps. The advantage of this method is that you can control the spatial depth, the front-to-back arrangement, of the generated image.
This workflow is also similar to the previous one. You can import a depth map directly, or use a plugin to generate one and then feed it into the ControlNet model. Like this:
notion image
A depth map carries more depth information than a pose. For example, my imported image shows two people fighting, one standing in front of the other. Using pose alone, it's relatively difficult to convey this kind of front-and-back arrangement.
If you want to generate depth maps using a plugin, as in the pose workflow, the method is also straightforward: simply replace the DWPose Estimation node in the workflow above with the Zoe-Depth Map node.
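Outside ComfyUI, a depth map can be produced with any monocular depth estimator. Here is a sketch using the transformers depth-estimation pipeline (the default checkpoint it downloads is an assumption):

```python
# Sketch: generating a depth map, analogous to the Zoe-Depth Map node.
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation")  # default model is an assumption
photo = Image.open("two_people.png")
depth = depth_estimator(photo)["depth"]         # PIL image; brighter = closer
depth.save("depth.png")                         # feed this to the Depth ControlNet
```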
Moreover, there's another method: you can use 3D tools to generate character poses or depth maps. For example, Posemy.art is such a product. You can choose the pose you want in the upper left corner (marked 1), adjust the character's pose by dragging with the mouse, then click the Export button (marked 2) to export the depth map (marked 3) and import it into ComfyUI:
notion image

Canny ControlNet Workflow

The fourth use of ControlNet is to control the image the model generates through Canny edge maps. The advantage of this method is that the edges of the generated image follow the Canny edge map, like this:
notion image
The workflow setup is similar to the previous one, just replace the ControlNet model with the Canny model.
notion image
 

2. Implementing ControlNet (Using Canny ControlNet as an Example)

Introduction to SD1.5 Canny ControlNet

notion image
Canny ControlNet is one of the most commonly used ControlNet models. It uses the Canny edge detection algorithm to extract edge information from images, then uses this edge information to guide AI image generation.
This tutorial focuses on using the Canny ControlNet model with SD1.5.
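The edge detection step itself is easy to try in isolation. A minimal OpenCV sketch (note that OpenCV takes thresholds on a 0-255 scale, whereas ComfyUI's Canny node uses normalized 0-1 values, so the numbers are not directly interchangeable):

```python
# Sketch: the Canny edge detection step on its own, using OpenCV.
# OpenCV thresholds are on a 0-255 scale; ComfyUI's Canny node uses
# normalized 0-1 values, so the two scales don't map exactly.
import cv2

image = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(image, threshold1=100, threshold2=200)  # low, high
cv2.imwrite("edges.png", edges)
```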

Key Features of Canny ControlNet

  • Structure Preservation: Effectively maintains the basic structure and outlines of the original image
  • High Flexibility: Control guidance strength through edge detection parameter adjustments
  • Wide Application: Suitable for sketches, line art, architectural designs, and various other scenarios
  • Stable Results: Provides more stable and predictable guidance compared to other ControlNet models

Preparation for This Tutorial

1. Update ComfyUI and Install Required Models

Since this workflow uses newer ComfyUI nodes, you need to update ComfyUI to the latest version first.
First, you need to install the following models:
| Model Type | Model File | Download Link |
| --- | --- | --- |
| SD1.5 Base Model | dreamshaper_8.safetensors | |
| Canny ControlNet Model | control_v11p_sd15_canny.pth | |
| VAE Model (Optional) | vae-ft-mse-840000-ema-pruned.safetensors | |

2. Model File Placement

Please place the model files according to the following structure:
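The tree below reflects ComfyUI's standard model directory layout (adjust the root path to your installation):

```
ComfyUI/
└── models/
    ├── checkpoints/
    │   └── dreamshaper_8.safetensors
    ├── controlnet/
    │   └── control_v11p_sd15_canny.pth
    └── vae/
        └── vae-ft-mse-840000-ema-pruned.safetensors
```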

3. Download SD1.5 Canny ControlNet Workflow File

SD1.5 Canny ControlNet Workflow
👇 Right-click the link below and choose “Save link as” to download the workflow, then import it into ComfyUI.
 
notion image

Workflow Overview

This workflow consists of the following main parts:
  1. Model Loading: Loading the SD model, VAE model, and ControlNet model
  2. Prompt Encoding: Processing positive and negative prompts
  3. Image Processing: Including image loading and Canny edge detection
  4. ControlNet Control: Applying edge information to the generation process
  5. Sampling and Saving: Generating and saving the final image

Key Nodes Explanation

  1. LoadImage: Used to load the input image
  2. Canny: Performs edge detection, with two important parameters:
      • low_threshold: Lower threshold, controls edge detection sensitivity
      • high_threshold: Upper threshold, controls edge continuity
  3. ControlNetLoader: Loads the ControlNet model
  4. ControlNetApplyAdvanced: Controls how ControlNet is applied (a diffusers equivalent of these parameters is sketched after this list), with parameters including:
      • strength: Control intensity
      • start_percent: When the influence begins, as a fraction of the sampling process
      • end_percent: When the influence ends
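For readers who also use diffusers, these three parameters map roughly onto the pipeline's own arguments. A sketch continuing the earlier pipeline example (the variable name canny_edges is an assumption):

```python
# Sketch: rough diffusers equivalents of ControlNetApplyAdvanced's parameters,
# continuing the earlier pipeline example. `canny_edges` is an assumed PIL image.
result = pipe(
    "a futuristic building",
    image=canny_edges,
    controlnet_conditioning_scale=0.8,  # strength: how strongly edges constrain the output
    control_guidance_start=0.0,         # start_percent: begin influence at 0% of the steps
    control_guidance_end=0.8,           # end_percent: stop influence at 80% of the steps
).images[0]
```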

Usage Steps

  1. Import Workflow
      • Download the workflow file from this tutorial
      • Click “Load” in ComfyUI, or drag and drop the downloaded JSON file into ComfyUI
  2. Prepare Input Image
      • Prepare an image you want to process
      • Load the image using the LoadImage node
  3. Adjust Canny Parameters
      • Recommended low_threshold range: 0.2-0.5
      • Recommended high_threshold range: 0.5-0.8
      • Preview the edge detection results using the PreviewImage node
  4. Set Generation Parameters (a diffusers equivalent is sketched after this list)
      • In the KSampler node:
        • steps: Recommended 20-30
        • cfg: Recommended 7-8
        • sampler_name: Recommended “dpmpp_2m”
        • scheduler: Recommended “karras”
  5. Adjust ControlNet Strength
      • strength: 1.0 means fully following the edge information
      • Reduce the strength value as needed to weaken the control
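As promised in step 4, here is how the recommended sampler settings look in diffusers terms, where “dpmpp_2m” plus “karras” corresponds to DPMSolverMultistepScheduler with Karras sigmas (a sketch continuing the earlier example):

```python
# Sketch: the recommended KSampler settings expressed in diffusers terms.
from diffusers import DPMSolverMultistepScheduler

# dpmpp_2m + karras ≈ DPM++ 2M with Karras sigmas.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)
result = pipe(
    "a detailed illustration",
    image=canny_edges,        # assumed control image, as above
    num_inference_steps=25,   # steps: recommended 20-30
    guidance_scale=7.5,       # cfg: recommended 7-8
).images[0]
```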

Tips and Recommendations

  1. Edge Detection Parameter Adjustment
      • If there are too many edges: increase the threshold values
      • If there are too few edges: decrease the threshold values
      • Preview the effect with the PreviewImage node first
  2. Prompt Writing
      • Positive prompts should describe the desired style and details
      • Negative prompts should include elements to avoid
      • Prompts should relate to the content of the original image
  3. Solutions to Common Issues
      • If the generated image is too blurry: increase the cfg value
      • If edge following is insufficient: increase the strength value
      • If details are lacking: increase the steps value

Practical Examples

Here are some common use cases and their parameter settings; a scriptable version of these presets follows the list:
  1. Line Art Coloring
      • low_threshold: 0.2
      • high_threshold: 0.5
      • strength: 1.0
      • steps: 25
  2. Structure Redrawing
      • low_threshold: 0.4
      • high_threshold: 0.7
      • strength: 0.8
      • steps: 30
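If you script your runs, these presets are easy to keep as plain data (the names below are illustrative only):

```python
# Illustrative presets mirroring the two use cases above.
CANNY_PRESETS = {
    "line_art_coloring":   {"low_threshold": 0.2, "high_threshold": 0.5, "strength": 1.0, "steps": 25},
    "structure_redrawing": {"low_threshold": 0.4, "high_threshold": 0.7, "strength": 0.8, "steps": 30},
}
```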


3. Implementing Multiple ControlNets

You can use multiple ControlNets at the same time to achieve finer control. You may need to install this extension first.
notion image
Then you can try this workflow; just stack different ControlNet models together.
notion image
You can use this mushroom image as an example.
notion image
Check the example (parameters and outputs) inside the PDF.
  • Positive prompt: Architectural visualization of exterior view futuristic half translucent biological pavilion on sunset, 8K, HD
  • Negative prompt: low quality, blur
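For reference, the same stacking works outside ComfyUI too; in diffusers you pass a list of ControlNets and a matching list of control images (a sketch; model IDs and file names are assumptions):

```python
# Sketch: stacking Canny + Depth ControlNets, analogous to chaining two
# Apply ControlNet nodes in ComfyUI. Model IDs and file names are assumptions.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")

result = pipe(
    "Architectural visualization of exterior view futuristic half translucent "
    "biological pavilion on sunset, 8K, HD",
    negative_prompt="low quality, blur",
    image=[load_image("edges.png"), load_image("depth.png")],
    controlnet_conditioning_scale=[1.0, 0.8],  # one strength per ControlNet
).images[0]
result.save("pavilion.png")
```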
 
 