cuDNN installation


vanguard

Last modified 2021-08-30 18:08:18



NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.

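Hardware preparation (power supply + motherboard + CPU + fans + memory + storage (NVMe SSD/HDD) + NVIDIA GPU)
Operating system and tool installation (Ubuntu 20.04 + update + net-tools + ssh + vim + python3-pip + samba + git + xrdp + virtualenv)
GPU driver and NVIDIA software installation (Driver + CUDA + cuDNN + TensorRT)
Dependency and framework installation (tensorflow-gpu + pytorch + opencv-python + yolo...)
Containerized or direct model training and inference (docker + nvidia-docker...)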

Installing cuDNN (at present you have to log in to the NVIDIA site to obtain this link)

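# Quote the URL (it contains "?") and save it under a fixed file name; log in to get a current link.
wget -O cudnn-11.4-linux-x64-v8.2.2.26.tgz "https://developer.download.nvidia.cn/compute/machine-learning/cudnn/secure/8.2.2/11.4_07062021/cudnn-11.4-linux-x64-v8.2.2.26.tgz?zVO0xngn9RHkR6idYHi7_WjTxJhRatqOB0Tsrbzn-y1zIokHbv0PQO_U8XLu7aMydM33JWOczvkirvAZ9BNN-aqsIyCpxg5Vc_sbF6AF8K6lGSXQ-CZXUe6IBt-5mcsMERGmkvQACeYRwKLqk7xy76mzV9epqp5_EgFkNFt7RcvA0T97ozdTs6e63yabuR5LkFx-de-Oa6IPbuU"
tar xvf cudnn-11.4-linux-x64-v8.2.2.26.tgz
# The tgz normally unpacks into ./cuda; cuDNN 8 ships several cudnn*.h headers, so copy them all.
sudo cp -a cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp -a cuda/lib64/libcudnn* /usr/local/cuda/lib64/
# Optional checks of the driver and the CUDA toolkit:
# nvidia-smi
# nvcc -V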
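
To confirm which cuDNN version ended up under the CUDA include directory, the version macros can be read back from the header. This is only a sketch: it assumes the header `cudnn_version.h` was copied alongside `cudnn.h` (cuDNN 8 keeps `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` there), and the function name `cudnn_version` is made up for this example.

```shell
# Print the cuDNN version recorded in cudnn_version.h.
# $1 = include directory (assumed default: /usr/local/cuda/include)
cudnn_version() {
  local header="${1:-/usr/local/cuda/include}/cudnn_version.h"
  [ -r "$header" ] || { echo "not found: $header" >&2; return 1; }
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END { if (major == "") exit 1; print major "." minor "." patch }
  ' "$header"
}

# Usage after the copy step:
#   cudnn_version /usr/local/cuda/include
```

If the function reports that the header is missing, the most likely cause is that only `cudnn.h` was copied rather than all `cudnn*.h` files.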

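The hard part is still installing CUDA. To start over from a clean state, NVIDIA documents how to remove the toolkit and the driver: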

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#removing-cuda-tk-and-driver

# To remove CUDA Toolkit:
sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" \
 "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*" 
# To remove NVIDIA Drivers:
sudo apt-get --purge remove "*nvidia*"
# To clean up the uninstall:
sudo apt-get autoremove
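
After the purge it can be worth checking whether any related packages are still installed. The following is a rough sketch: the helper name `filter_gpu_packages` is hypothetical, and its pattern simply mirrors the globs used in the removal commands above, so it may also match unrelated packages whose names contain `npp`.

```shell
# Read package names on stdin and print those that look CUDA- or NVIDIA-related.
filter_gpu_packages() {
  grep -E 'cublas|cufft|curand|cusolver|cusparse|npp|nvjpeg|^cuda|^nsight|nvidia' || true
}

# Usage on a Debian/Ubuntu system (lists packages known to dpkg):
#   dpkg-query -W -f='${Package}\n' | filter_gpu_packages
```

An empty result suggests the removal was complete; anything listed can be inspected before reinstalling.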

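Install the driver separately where possible, because some software depends on the driver but not on CUDA, especially when you are replacing the stock driver. Once everything is installed, set the environment variables: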

# Add these two lines to ~/.bashrc:
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# Then reload it in the current shell:
source ~/.bashrc
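
The `${VAR:+:${VAR}}` form in the exports appends `:$VAR` only when the variable is already set and non-empty, so an unset variable does not leave a trailing colon (an empty entry in a search path is generally treated as the current directory). A small sketch of the same expansion, using a hypothetical helper `prepend_dir`:

```shell
# Print "$1" followed by ":$2" only when $2 is non-empty.
# $1 = directory to prepend, $2 = current value of the path variable (may be empty)
prepend_dir() {
  printf '%s%s\n' "$1" "${2:+:$2}"
}

# Usage:
#   prepend_dir /usr/local/cuda-11.4/lib64 "$LD_LIBRARY_PATH"
```

With an empty second argument the helper prints just the directory; with `/b:/c` it prints the directory followed by `:/b:/c`.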

If cuDNN is not installed, TensorFlow may skip the GPU entirely:

2021-08-26 19:55:22.789937: W 
tensorflow/stream_executor/platform/default/dso_loader.cc:64] 
Could not load dynamic library 'libcudnn.so.8'; 
dlerror: libcudnn.so.8: 
cannot open shared object file: No such file or directory; 
LD_LIBRARY_PATH: /usr/local/cuda-11.4/lib64

2021-08-26 19:55:22.790001: W 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. 
Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. 
Follow the guide at 
https://www.tensorflow.org/install/gpu 
for how to download and setup the required libraries for your platform.

Skipping registering GPU devices...

2021-08-26 19:55:22.790631: I 
tensorflow/core/platform/cpu_feature_guard.cc:142] 
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) 
to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2021-08-26 19:55:23.528475: 
I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] 
None of the MLIR Optimization Passes are enabled (registered 2)
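
The warning above names both the missing file (`libcudnn.so.8`) and the search path (`/usr/local/cuda-11.4/lib64`), so one quick check is whether that file exists in any `LD_LIBRARY_PATH` directory. This is a minimal sketch with a made-up helper name, `find_in_library_path`; it looks only at the directories passed in, whereas the dynamic loader also consults the `ldconfig` cache and default system directories.

```shell
# Print the first match of a library file in a colon-separated directory list.
# $1 = library file name, $2 = directory list (for example "$LD_LIBRARY_PATH")
find_in_library_path() {
  local lib="$1" dirs="$2" dir
  local IFS=:
  for dir in $dirs; do
    [ -n "$dir" ] || continue        # skip empty entries
    if [ -e "$dir/$lib" ]; then
      printf '%s\n' "$dir/$lib"
      return 0
    fi
  done
  return 1
}

# Usage:
#   find_in_library_path libcudnn.so.8 "$LD_LIBRARY_PATH" || echo "libcudnn.so.8 not found"
```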

Once cuDNN is installed, the GPU is used, and you can watch GPU memory and other usage with nvidia-smi:

2021-08-30 16:57:03.457415: I 
tensorflow/core/platform/cpu_feature_guard.cc:142] 
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) 
to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2021-08-30 16:57:05.198665: I 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] 
Created device /job:localhost/replica:0/task:0/device:GPU:0 with 17540 MB memory:  
-> device: 0, name: NVIDIA GeForce RTX 3090, 
pci bus id: 0000:02:00.0, compute capability: 8.6

2021-08-30 16:57:06.848155: I 
tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] 
None of the MLIR Optimization Passes are enabled (registered 2)

Epoch 1/5
2021-08-30 16:57:10.171347: I 
tensorflow/stream_executor/cuda/cuda_blas.cc:1760] 
TensorFloat-32 will be used for the matrix multiplication. 
This will only be logged once.

Mon Aug 30 17:17:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| 35%   50C    P2   109W / 350W |  23055MiB / 24265MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:82:00.0 Off |                  N/A |
| 34%   44C    P0   110W / 350W |      0MiB / 24268MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      5741      C   python                          23053MiB |
+-----------------------------------------------------------------------------+
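
For scripted monitoring, `nvidia-smi` can also emit the same figures as CSV through its `--query-gpu` and `--format` options. The sketch below turns that output into a per-GPU memory percentage; the helper name `gpu_memory_percent` is an assumption of this example, and the percentage is truncated to an integer.

```shell
# Read "index, memory.used, memory.total" CSV lines (MiB, no units) on stdin
# and print the share of memory in use for each GPU.
gpu_memory_percent() {
  awk -F', *' '$3 > 0 { printf "GPU %s: %d%%\n", $1, ($2 * 100) / $3 }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,memory.used,memory.total \
#              --format=csv,noheader,nounits | gpu_memory_percent
```

Fed the values from the table above (23055 MiB of 24265 MiB on GPU 0), this prints `GPU 0: 95%`.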


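Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.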
