Setting up Stable Diffusion with Hugging Face's Diffusers Library

From ACCRE Wiki

Accessing the ACCRE Visualization Portal

To start, you'll need to access the ACCRE Visualization Portal and launch a Jupyter Notebook session with GPU support. Follow these steps:

1. Go to the ACCRE Visualization Portal.
2. Click ACCRE Visualization Portal.
3. On the login page, enter your VUID and password.

Login Page


4. Once logged in, go to the top menu and click on Interactive Apps.
5. Select Jupyter Notebook (GPU).
6. For the 'GPU Enabled Slurm Account', type accre_workshops_acc.
7. For the 'Number of hours', type 1.
8. For the GPU architecture, choose A6000.
9. For the Python version, choose Python 3.9.12/Anaconda 2022.05.
10. Click Launch.
11. On the next page, click Connect to Jupyter.

Jupyter Notebook Session


12. In the Jupyter interface, click the New dropdown and select Terminal; the package installations below are run there.

Step 1: Install Necessary Packages

First, you need to install the required packages. Run the following commands:

# Copy these lines into the terminal to install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install diffusers, invisible_watermark, transformers, accelerate, and safetensors
pip install diffusers invisible_watermark transformers accelerate safetensors
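After the installs finish, it can be worth confirming that PyTorch imports cleanly and sees the GPU before moving on. The snippet below is a small sanity check, not part of the original guide; run it with `python` in the same terminal:

```python
# Post-install sanity check: verify PyTorch imports and report GPU visibility.
try:
    import torch
    status = f"PyTorch {torch.__version__}; CUDA available: {torch.cuda.is_available()}"
except ImportError:
    status = "PyTorch is not installed; rerun the pip install command above."
print(status)
```

On a correctly configured GPU session, the CUDA check should report True.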


13. In the Jupyter interface, click the New dropdown and select Python 3 to open a new notebook.

Now you are ready to follow the guide below to set up Stable Diffusion.

Setting up Stable Diffusion with Hugging Face's Diffusers Library

This guide will walk you through the process of generating and refining images using Hugging Face's Diffusers library. Follow the steps below to install necessary packages, set up directories, load models, generate images, and refine them.

Stable Diffusion Images

Step 2: Import Required Libraries

Import the necessary libraries to use the DiffusionPipeline and other modules.

from diffusers import DiffusionPipeline
import torch
import os

Step 3: Set Up Directories

Create an output directory to save the generated images.

# Set up directories (os.makedirs creates the folder and does nothing if it already exists)
output_dir = "/home/<YOURUSERNAME>/saved_img/"
os.makedirs(output_dir, exist_ok=True)
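If you would rather not type your username into the path by hand, one alternative sketch is to derive the home directory with the standard library (this assumes your home directory follows the usual layout on the cluster):

```python
import os

# Derive the home directory instead of hardcoding /home/<YOURUSERNAME>.
output_dir = os.path.join(os.path.expanduser("~"), "saved_img")
os.makedirs(output_dir, exist_ok=True)
print(output_dir)
```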

Step 4: Load the Base Model

Load the base model from Hugging Face's repository.

# Load the base model
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", 
    torch_dtype=torch.float16, 
    use_safetensors=True, 
    variant="fp16"
)
pipe.to("cuda")

Step 5: Generate an Image with the Base Model

Generate an image using the base model with a given prompt.

# Generate an image with the base model
prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt).images[0]

# Save the generated image
output_path = os.path.join(output_dir, "output_image.png")
image.save(output_path)
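Each run of the cell above overwrites `output_image.png`. If you want to keep every generation, one option is a small helper that timestamps the filename; `unique_image_path` below is a hypothetical helper (not part of diffusers), and its result is what you would pass to `image.save()`:

```python
import os
from datetime import datetime

def unique_image_path(output_dir, stem="output_image"):
    """Build a timestamped .png path so repeated runs don't overwrite files."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return os.path.join(output_dir, f"{stem}_{stamp}.png")

path = unique_image_path("/tmp")  # then: image.save(path)
print(path)
```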

Step 6: Load the Base and Refiner Models

Load the base and refiner models for further refinement.

# Load the base model with specific settings
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # The pre-trained base model
    torch_dtype=torch.float16,                   # Use 16-bit floating point precision for tensors
    variant="fp16",                              # Specify the model variant as FP16
    use_safetensors=True                         # Use safetensors format for loading the model
)
base.to("cuda")  # Move the base model to the GPU for faster inference

# Load the refiner model with shared components and specific settings
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",  # The pre-trained refiner model
    text_encoder_2=base.text_encoder_2,             # Share the second text encoder from the base model
    vae=base.vae,                                   # Share the Variational Autoencoder (VAE) from the base model
    torch_dtype=torch.float16,                      # Use 16-bit floating point precision for tensors
    use_safetensors=True,                           # Use safetensors format for loading the model
    variant="fp16"                                  # Specify the model variant as FP16
)
refiner.to("cuda")  # Move the refiner model to the GPU for faster inference

Step 7: Generate and Refine the Image

Define the steps and prompt for generating and refining the image.

# Define the steps and prompt
n_steps = 40  # Number of inference steps for generating and refining the image
high_noise_frac = 0.8  # Fraction of the denoising schedule handled by the base model
prompt = "A Flying Pig with a sword over a Volcano"  # Text prompt for image generation

# Generate the latent image using the base model
latent_image = base(
    prompt=prompt,                    # The text prompt to generate the initial image
    num_inference_steps=n_steps,      # Number of steps for the inference process
    denoising_end=high_noise_frac,    # End fraction of the denoising process
    output_type="latent"              # Output the result as a latent image
).images

# Refine the image using the refiner model
final_image = refiner(
    prompt=prompt,                    # The same text prompt for refining the image
    num_inference_steps=n_steps,      # Number of steps for the refining process
    denoising_start=high_noise_frac,  # Start fraction of the denoising process
    image=latent_image                # The latent image generated by the base model
).images[0]

# Save the refined image
refined_output_path = os.path.join(output_dir, "refined_image.png")  # Define the output path for the refined image
final_image.save(refined_output_path)  # Save the refined image to the specified path
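To make the base/refiner hand-off concrete, here is a rough reading of how `denoising_end` and `denoising_start` split the 40 steps. Treat the counts as approximate: diffusers maps the fraction onto timesteps internally, so the exact cutoff can differ slightly.

```python
n_steps = 40
high_noise_frac = 0.8

# Roughly: the base model denoises the first 80% of the schedule
# and the refiner finishes the remaining 20%.
base_steps = int(n_steps * high_noise_frac)
refiner_steps = n_steps - base_steps
print(base_steps, refiner_steps)
```

With these settings the base model handles about 32 steps and the refiner about 8, which is why the refined image keeps the overall composition while gaining fine detail.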


By following these steps, you can generate and refine images using Stable Diffusion and Hugging Face's Diffusers library. Make sure to update the `output_dir` variable if you want to save images in a different directory. For more information, visit the Hugging Face Stable Diffusion XL Base Model page.


This document has been developed by the Center for Applied AI in Protein Dynamics.