Setting up Stable Diffusion with Hugging Face's Diffusers Library
Accessing the ACCRE Visualization Portal
To start, you'll need to access the ACCRE Visualization Portal and launch a Jupyter Notebook session with GPU support. Follow these steps:
1. Go to the ACCRE Visualization Portal.
2. Click on ACCRE Visualization Portal.
3. Enter your VUID and password on the login page.

4. Once logged in, go to the top menu and click on Interactive Apps.
5. Select Jupyter Notebook (GPU).
6. For the 'GPU Enabled Slurm Account', type accre_workshops_acc.
7. For the 'Number of hours', type 1.
8. For the GPU architecture, choose A6000.
9. For the Python version, choose Python 3.9.12/Anaconda 2022.05.
10. Click Launch.
11. On the next page, click Connect to Jupyter.

12. In the Jupyter interface, click the New dropdown and select Terminal; this is where we will install the packages.
Step 1: Install Necessary Packages
First, you need to install the required packages. Run the following commands:
# Copy these lines into the terminal to install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install diffusers, invisible_watermark, transformers, accelerate, and safetensors
pip install diffusers invisible_watermark transformers accelerate safetensors
13. Back in the Jupyter interface, click the New dropdown and select Python 3 to open a notebook.
Now you are ready to follow the guide below to set up Stable Diffusion.
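Before continuing, you can optionally confirm that the packages from Step 1 are visible to your notebook kernel. This is a quick stdlib-only check (not part of the Diffusers library); it prints MISSING for anything that failed to install:

```python
import importlib.util

# Confirm the packages installed in Step 1 are importable from this kernel.
# importlib.util.find_spec returns None when a package cannot be found.
for name in ["torch", "diffusers", "transformers", "accelerate", "safetensors"]:
    found = importlib.util.find_spec(name) is not None
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

If anything is reported MISSING, rerun the corresponding pip command in the terminal before proceeding.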
This guide will walk you through the process of generating and refining images using Hugging Face's Diffusers library. Follow the steps below to install necessary packages, set up directories, load models, generate images, and refine them.

Step 2: Import Required Libraries
Import the necessary libraries to use the DiffusionPipeline and other modules.
from diffusers import DiffusionPipeline
import torch
import os
Step 3: Set Up Directories
Create an output directory to save the generated images.
# Set up directories (os.makedirs creates the folder if it does not already exist)
output_dir = "/home/<YOURUSERNAME>/saved_img/"
os.makedirs(output_dir, exist_ok=True)
Step 4: Load the Base Model
Load the base model from Hugging Face's repository.
# Load the base model
pipe = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
torch_dtype=torch.float16,
use_safetensors=True,
variant="fp16"
)
pipe.to("cuda")
Step 5: Generate an Image with the Base Model
Generate an image using the base model with a given prompt.
# Generate an image with the base model
prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt).images[0]

# Save the generated image
output_path = os.path.join(output_dir, "output_image.png")
image.save(output_path)
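Note that rerunning this cell silently overwrites output_image.png. If you want to keep every result, a small stdlib-only helper can pick a non-colliding filename (unique_path is a hypothetical name for this sketch, not part of Diffusers):

```python
import os

def unique_path(directory, stem="output_image", ext=".png"):
    """Return a file path in `directory` that does not clash with existing files."""
    path = os.path.join(directory, stem + ext)
    counter = 1
    while os.path.exists(path):  # bump the numeric suffix until the name is free
        path = os.path.join(directory, f"{stem}_{counter}{ext}")
        counter += 1
    return path
```

You could then save with image.save(unique_path(output_dir)) instead of a fixed path.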
Step 6: Load the Base and Refiner Models
Load the base and refiner models for further refinement.
# Load the base model with specific settings
base = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", # The pre-trained base model
torch_dtype=torch.float16, # Use 16-bit floating point precision for tensors
variant="fp16", # Specify the model variant as FP16
use_safetensors=True # Use safetensors format for loading the model
)
base.to("cuda") # Move the base model to the GPU for faster inference
# Load the refiner model with shared components and specific settings
refiner = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-refiner-1.0", # The pre-trained refiner model
text_encoder_2=base.text_encoder_2, # Share the second text encoder from the base model
vae=base.vae, # Share the Variational Autoencoder (VAE) from the base model
torch_dtype=torch.float16, # Use 16-bit floating point precision for tensors
use_safetensors=True, # Use safetensors format for loading the model
variant="fp16" # Specify the model variant as FP16
)
refiner.to("cuda") # Move the refiner model to the GPU for faster inference
Step 7: Generate and Refine the Image
Define the steps and prompt for generating and refining the image.
# Define the steps and prompt
n_steps = 40 # Number of inference steps for generating and refining the image
high_noise_frac = 0.8 # Point in the denoising schedule where the base model hands off to the refiner
prompt = "A Flying Pig with a sword over a Volcano" # Text prompt for image generation
# Generate the latent image using the base model
latent_image = base(
prompt=prompt, # The text prompt to generate the initial image
num_inference_steps=n_steps, # Number of steps for the inference process
denoising_end=high_noise_frac, # End fraction of the denoising process
output_type="latent" # Output the result as a latent image
).images
# Refine the image using the refiner model
final_image = refiner(
prompt=prompt, # The same text prompt for refining the image
num_inference_steps=n_steps, # Number of steps for the refining process
denoising_start=high_noise_frac, # Start fraction of the denoising process
image=latent_image # The latent image generated by the base model
).images[0]
# Save the refined image
refined_output_path = os.path.join(output_dir, "refined_image.png") # Define the output path for the refined image
final_image.save(refined_output_path) # Save the refined image to the specified path
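To build intuition for the handoff: with these settings the base model covers roughly the first 80% of the denoising schedule and the refiner the remainder. A back-of-the-envelope sketch (the scheduler may round the boundary slightly differently in practice):

```python
n_steps = 40          # total inference steps, as above
high_noise_frac = 0.8 # handoff point in the schedule, as above

# Approximate split of the 40 inference steps between the two models
base_steps = round(n_steps * high_noise_frac)  # steps handled by the base model
refiner_steps = n_steps - base_steps           # steps handled by the refiner
print(f"base: {base_steps} steps, refiner: {refiner_steps} steps")
```

Raising high_noise_frac gives the base model more of the schedule and leaves less work for the refiner; lowering it does the opposite.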
By following these steps, you can generate and refine images using Stable Diffusion and Hugging Face's Diffusers library. Make sure to update the `output_dir` variable if you want to save images in a different directory.
For more information, visit the Hugging Face Stable Diffusion XL base model page at https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0.
This document has been developed by the Center for Applied AI in Protein Dynamics.