How to Deploy Stable Diffusion Locally on Windows 10

通信行业搬砖工

Published 2024-11-14 16:38:15

Column: 网络虚拟化 (Network Virtualization)

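01=Introduction to Stable Diffusion

With artificial intelligence advancing rapidly, Stable Diffusion has become one of the most widely used deep-learning models for image generation. It is open source, can be deployed locally, and suits a wide range of image-processing scenarios.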

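I. Technical Background and Principles

Stable Diffusion is a text-to-image model in the diffusion-model family. It was built on the Latent Diffusion Model by researchers from CompVis, Stability AI, and LAION, and was officially released by Stability AI in August 2022. Compared with earlier diffusion models that work directly on pixels, it generates high-quality images in a more stable, controllable, and efficient way, and it is a prominent example of how deep learning can turn text into visual content in the multimodal field.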

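Stable Diffusion is a latent diffusion model with three main parts. A variational autoencoder (VAE) compresses images into a smaller latent representation and decodes latents back into images. A U-Net denoiser removes noise from a latent step by step. A text encoder turns the prompt into embeddings that steer the denoiser through cross-attention, so the generated image matches the input condition. Unlike a generative adversarial network (GAN), the image is not produced by a generator competing against a discriminator; it is produced by repeatedly denoising random latent noise under the text condition.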

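II. Features and Advantages

Efficiency and stability: Stable Diffusion maintains image quality and realism while remaining stable and fast. Its XL version is reported to be roughly 30 times more efficient than earlier diffusion models, and because denoising runs in a compressed latent space, images can be generated directly on consumer-grade graphics cards.
High resolution and realism: Stable Diffusion can generate high-resolution, highly realistic images; the XL version can generate controllable images at 1024x1024 pixels.
Open source: Stable Diffusion is open source and can be deployed locally, which gives you more adjustable parameters and plugins and much finer control over the image.
Diverse application scenarios: Stable Diffusion can be used for image retrieval, image generation, style transfer, image restoration, and more. It has broad potential in art, film visual effects, game development, and medical image processing.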

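III. Workflow

The Stable Diffusion workflow converts text into an image in stages, from parsing the text to refining the image. The steps are:

The user's prompt is processed by a Transformer-based text encoder (CLIP in Stable Diffusion 1.x, OpenCLIP in 2.x), which turns it into embeddings that condition generation.
Starting from random noise in latent space, a U-Net denoiser removes noise step by step under guidance from the text embeddings. The decoder of a variational autoencoder (VAE) then converts the final latent into an image. Working in the compressed latent space keeps the computation manageable while preserving image quality.
Optionally, a final super-resolution stage raises the resolution and detail of the image. This step uses a separate diffusion model that focuses on restoring detail from a lower-resolution image.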
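To make the latent-space step more concrete, here is a minimal sketch of the size arithmetic. It assumes a latent with 4 channels and a spatial downsampling factor of 8, which I believe matches the defaults of the txt2img.py script used later in this article; the function name latent_shape is illustrative only and is not part of the Stable Diffusion code base.

```python
def latent_shape(height: int, width: int, channels: int = 4, factor: int = 8):
    """Return the (C, H/f, W/f) latent shape for a requested image size.

    channels=4 and factor=8 are assumptions, not values read from a model file.
    """
    if height % factor or width % factor:
        raise ValueError("height and width must be multiples of the downsampling factor")
    return (channels, height // factor, width // factor)


for size in (512, 768):
    c, h, w = latent_shape(size, size)
    pixel_values = 3 * size * size   # RGB values in the final image
    latent_values = c * h * w        # values the denoiser actually works on
    print(f"{size}x{size} image -> latent {c}x{h}x{w}, "
          f"{pixel_values // latent_values}x fewer values")
```

Under these assumptions the denoiser handles 48 times fewer values than the final RGB image, which is a large part of why generation is feasible on a consumer GPU.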

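IV. Application Areas

Art: artists and designers can use Stable Diffusion for digital art, entering a description of an idea and generating high-quality images or animations that match their creative intent.
Pattern design and advertising: designers can use Stable Diffusion for pattern design and ad production, exploring the different artistic effects that different text and images produce.
Film production and game development: Stable Diffusion can be used for film visual effects and for scene and character design in games.
Medical image processing: Stable Diffusion can be used to restore, enhance, and denoise medical images, improving their quality and readability.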

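02=Preparation Before Installation

Before installing Stable Diffusion (SD), install Python. Python 3.10 or later is recommended.

1. Download the Python installer for the version you want: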

https://www.python.org/downloads/

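2. Double-click the installer and check the option that adds Python to the PATH environment variable.

3. Installation process... (omitted)

4. After installation, check the version information, for example with python --version.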
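The version check can also be done from Python itself. The following is a small sketch; the helper name meets_minimum is made up for illustration, and the (3, 10) minimum simply mirrors the recommendation above.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Show the interpreter that "python" resolves to and whether it is new enough.
print("Python", sys.version.split()[0])
print("OK" if meets_minimum() else "Too old: install Python 3.10 or later")
```

If the printed version is not the one you just installed, another Python installation is probably earlier on PATH.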

03=Installing Stable Diffusion

1. Download the Stable Diffusion source code

git clone https://github.com/Stability-AI/stablediffusion.git

[Screenshot: the project page on GitHub]

[Screenshot: the cloned project folder]

2. Install the dependencies:

Upgrade pip:

python -m pip install --upgrade pip

Install the libraries: run a command of the form pip install albumentations==0.4.3 for each of the following packages in turn.

albumentations==0.4.3
opencv-python
pudb==2019.2
imageio==2.9.0
imageio-ffmpeg==0.4.2
pytorch-lightning==1.4.2
torchmetrics==0.6
omegaconf==2.1.1
test-tube>=0.7.5
streamlit>=0.73.1
einops==0.3.0
transformers==4.19.2
webdataset==0.2.5
open-clip-torch==2.7.0
gradio==3.13.2
kornia==0.6
invisible-watermark>=0.1.5
streamlit-drawable-canvas==0.8.0

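Batch installation (installs everything listed in requirements.txt, here through the Tsinghua PyPI mirror):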

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Download the pretrained model v2-1_768-ema-pruned.ckpt and place it in the checkpoints folder. Download site: https://huggingface.co/
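Before running anything, a quick pre-flight check can confirm that the checkpoint and the other files used by the run command are where the command expects them. This is only a sketch: the helper name missing_files is made up, and the relative paths are taken from the command line shown in the next section, assuming the script is run from the repository root.

```python
from pathlib import Path


def missing_files(repo_root, relative_paths):
    """Return the entries of relative_paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in relative_paths if not (root / p).is_file()]


# Paths referenced by the txt2img command in this article.
REQUIRED = [
    "checkpoints/v2-1_768-ema-pruned.ckpt",
    "configs/stable-diffusion/v2-inference-v.yaml",
    "scripts/txt2img.py",
]

for path in missing_files(".", REQUIRED):
    print("missing:", path)
```

No output means all three files were found relative to the current directory.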

04=Running Stable Diffusion

1. Run Stable Diffusion

python ./scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt ./checkpoints/v2-1_768-ema-pruned.ckpt --config ./configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768

Problem encountered: ModuleNotFoundError: No module named 'ldm'

F:\StableDiffusion\stablediffusion>
F:\StableDiffusion\stablediffusion>python ./scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt ./checkpoints/v2-1_768-ema-pruned.ckpt --config ./configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
Traceback (most recent call last):
  File "F:\StableDiffusion\stablediffusion\scripts\txt2img.py", line 16, in <module>
    from ldm.util import instantiate_from_config
ModuleNotFoundError: No module named 'ldm'

Fix: copy the ldm folder into the scripts folder, or create a symbolic link to it there.
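The error arises because Python puts the directory of the script being run (here, scripts) at the front of its import path, not the directory you launched it from, so the ldm package in the repository root is not visible. As an alternative to copying or linking the folder, the repository root can be added to the import path. The sketch below shows the idea; add_repo_root is a hypothetical helper, not something the project provides.

```python
import sys
from pathlib import Path


def add_repo_root(script_file, search_path=None):
    """Prepend the parent of the script's folder (the repository root) to the import path."""
    search_path = sys.path if search_path is None else search_path
    repo_root = str(Path(script_file).resolve().parent.parent)
    if repo_root not in search_path:
        search_path.insert(0, repo_root)
    return repo_root


# Hypothetical usage at the top of scripts/txt2img.py,
# before the line "from ldm.util import instantiate_from_config":
# add_repo_root(__file__)
```

Setting the PYTHONPATH environment variable to the repository root before running the command has the same effect without editing any file.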

Run the Stable Diffusion command again:

F:\StableDiffusion\stablediffusion>python ./scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt ./checkpoints/v2-1_768-ema-pruned.ckpt --config ./configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
F:\StableDiffusion\stablediffusion\scripts\ldm\models\diffusion\dpm_solver\dpm_solver.py:16: SyntaxWarning: invalid escape sequence '\h'
  """Create a wrapper class for the forward SDE (VP type).
Global seed set to 42
Loading model from ./checkpoints/v2-1_768-ema-pruned.ckpt
F:\StableDiffusion\stablediffusion\scripts\txt2img.py:30: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  pl_sd = torch.load(ckpt, map_location="cpu")
Global Step: 110000
No module 'xformers'. Proceeding without it.
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Downloading: "https://github.com/DagnyT/hardnet/raw/master/pretrained/train_liberty_with_aug/checkpoint_liberty_with_aug.pth" to C:\Users\fangt/.cache\torch\hub\checkpoints\checkpoint_liberty_with_aug.pth
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5.10M/5.10M [00:00<00:00, 9.56MB/s]

If the run fails because GPU memory is exhausted:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.49 GiB. GPU 0 has a total capacty of 11.99 GiB of which 0 bytes is free. Of the allocated memory 14.77 GiB is allocated by PyTorch, and 9.52 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Fix: reduce the output size to 512x512 to save GPU memory, using --H 512 --W 512.

F:\StableDiffusion\stablediffusion>python ./scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt ./checkpoints/v2-1_768-ema-pruned.ckpt --config ./configs/stable-diffusion/v2-inference-v.yaml --H 512 --W 512
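Why the smaller size helps can be estimated with a rough calculation. Without xformers (the log above reports "No module 'xformers'"), the vanilla self-attention layers build a full tokens-by-tokens matrix, and the number of tokens grows with the latent area. The sketch below rests on several assumptions that are not stated in the article: a batch of 6 (3 samples per prompt, doubled by classifier-free guidance), 5 attention heads at the highest-resolution layer, 32-bit floats, and a downsampling factor of 8. The function name is illustrative.

```python
def attention_matrix_gib(height, width, batch=6, heads=5, factor=8, bytes_per_value=4):
    """Estimate the size in GiB of one full self-attention matrix.

    All default values are assumptions about the model and script settings.
    """
    tokens = (height // factor) * (width // factor)  # one token per latent position
    return batch * heads * tokens * tokens * bytes_per_value / 2**30


for size in (768, 512):
    print(f"{size}x{size}: {attention_matrix_gib(size, size):.2f} GiB")
```

Under these assumptions the 768x768 estimate comes to about 9.49 GiB, which lines up with the "Tried to allocate 9.49 GiB" in the error message, while 512x512 needs about 1.9 GiB for the same matrix. Generating fewer images per batch should reduce the figure proportionally, if your copy of txt2img.py exposes an option for the number of samples.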