NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.
Installing cuDNN (obtaining this download link currently requires logging in):
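# Quote the URL (it contains '?') and save it under a clean file name
wget -O cudnn-11.4-linux-x64-v8.2.2.26.tgz "https://developer.download.nvidia.cn/compute/machine-learning/cudnn/secure/8.2.2/11.4_07062021/cudnn-11.4-linux-x64-v8.2.2.26.tgz?zVO0xngn9RHkR6idYHi7_WjTxJhRatqOB0Tsrbzn-y1zIokHbv0PQO_U8XLu7aMydM33JWOczvkirvAZ9BNN-aqsIyCpxg5Vc_sbF6AF8K6lGSXQ-CZXUe6IBt-5mcsMERGmkvQACeYRwKLqk7xy76mzV9epqp5_EgFkNFt7RcvA0T97ozdTs6e63yabuR5LkFx-de-Oa6IPbuU"
tar xvf cudnn-11.4-linux-x64-v8.2.2.26.tgz
# The archive unpacks into cuda/; cuDNN 8 ships several headers, so copy them all
sudo cp -a cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp -a cuda/lib64/libcudnn* /usr/local/cuda/lib64/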
# Check the driver and the CUDA toolkit:
nvidia-smi
nvcc -V
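To confirm which cuDNN release actually landed in the CUDA include directory, the version macros in cudnn_version.h can be read back. The following is a minimal sketch: the function name cudnn_version is made up for illustration, and the default path assumes cudnn_version.h was copied to /usr/local/cuda/include/ together with cudnn.h.

```shell
# cudnn_version [HEADER]: print the cuDNN version found in a cudnn_version.h
# header as MAJOR.MINOR.PATCH. The default path is an assumption based on the
# copy destination used in this article.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    [ -r "$header" ] || { echo "header not found: $header" >&2; return 1; }
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { printf "%s.%s.%s\n", major, minor, patch }
    ' "$header"
}

# Usage:
#   cudnn_version
#   cudnn_version /path/to/cudnn_version.h
```

For the archive used here the expected result would be 8.2.2; a "header not found" message suggests that only cudnn.h was copied.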
The hard part is still installing CUDA:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#removing-cuda-tk-and-driver
# To remove CUDA Toolkit:
sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" \
"*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*"
# To remove NVIDIA Drivers:
sudo apt-get --purge remove "*nvidia*"
# To clean up the uninstall:
sudo apt-get autoremove
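Install the driver separately where possible, because some software depends on the driver but not on CUDA, especially when the distribution's stock driver has to be replaced. After installation, set the environment variables: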
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# If the exports above were added to ~/.bashrc, reload it:
source ~/.bashrc
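The `${PATH:+:${PATH}}` form in the exports is easy to misread. It expands to a colon followed by the old value only when the variable is already set and non-empty, so an empty variable does not leave a trailing colon (an empty entry is generally treated as the current directory). A small sketch with a hypothetical helper named prepend_dir shows both cases:

```shell
# prepend_dir CURRENT DIR: print DIR prepended to the colon-separated list
# CURRENT, using the same ${var:+...} expansion as the export lines.
prepend_dir() {
    local current="$1" dir="$2"
    # ${current:+:${current}} yields ":<current>" only if current is non-empty
    printf '%s\n' "${dir}${current:+:${current}}"
}

# Usage:
#   prepend_dir "" /usr/local/cuda-11.4/lib64
#   prepend_dir /usr/bin /usr/local/cuda-11.4/bin
```

With an empty first argument the result is just the new directory; otherwise the new directory is placed in front of the existing list.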
Without cuDNN installed, TensorFlow may skip the GPU entirely:
2021-08-26 19:55:22.789937: W
tensorflow/stream_executor/platform/default/dso_loader.cc:64]
Could not load dynamic library 'libcudnn.so.8';
dlerror: libcudnn.so.8:
cannot open shared object file: No such file or directory;
LD_LIBRARY_PATH: /usr/local/cuda-11.4/lib64
2021-08-26 19:55:22.790001: W
tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries.
Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU.
Follow the guide at
https://www.tensorflow.org/install/gpu
for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-08-26 19:55:22.790631: I
tensorflow/core/platform/cpu_feature_guard.cc:142]
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)
to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-26 19:55:23.528475: I
tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185]
None of the MLIR Optimization Passes are enabled (registered 2)
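The warning above names both the missing file (libcudnn.so.8) and the LD_LIBRARY_PATH that was searched, so a quick check is to walk that path list and look for the file. The sketch below uses a hypothetical helper named find_lib; it only inspects the directories it is given, whereas the dynamic loader also consults other locations such as the ldconfig cache.

```shell
# find_lib NAME PATHLIST: print the first directory in the colon-separated
# PATHLIST that contains a file named NAME; return 1 if none does.
find_lib() {
    local name="$1" pathlist="$2" dir
    local IFS=':'
    for dir in $pathlist; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage:
#   find_lib libcudnn.so.8 "$LD_LIBRARY_PATH" \
#       || echo "libcudnn.so.8 is not on LD_LIBRARY_PATH"
```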
Once cuDNN is installed, the GPU is used, and nvidia-smi shows GPU memory usage and related statistics:
2021-08-30 16:57:03.457415: I
tensorflow/core/platform/cpu_feature_guard.cc:142]
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)
to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-30 16:57:05.198665: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1510]
Created device /job:localhost/replica:0/task:0/device:GPU:0 with 17540 MB memory:
-> device: 0, name: NVIDIA GeForce RTX 3090,
pci bus id: 0000:02:00.0, compute capability: 8.6
2021-08-30 16:57:06.848155: I
tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185]
None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/5
2021-08-30 16:57:10.171347: I
tensorflow/stream_executor/cuda/cuda_blas.cc:1760]
TensorFloat-32 will be used for the matrix multiplication.
This will only be logged once.
Mon Aug 30 17:17:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| 35%   50C    P2   109W / 350W |  23055MiB / 24265MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:82:00.0 Off |                  N/A |
| 34%   44C    P0   110W / 350W |      0MiB / 24268MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      5741      C   python                          23053MiB |
+-----------------------------------------------------------------------------+
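For scripted monitoring, the same memory figures can be requested in CSV form with nvidia-smi's --query-gpu option and reduced to one line per GPU. This is a sketch only: mem_usage_percent is a hypothetical helper, and it assumes the input rows look like "index, used, total" as produced with --format=csv,noheader,nounits.

```shell
# mem_usage_percent: read "index, used_MiB, total_MiB" CSV rows on stdin
# and print "GPU <index>: <percent>%" for each row.
mem_usage_percent() {
    awk -F', *' '{ printf "GPU %s: %.1f%%\n", $1, 100 * $2 / $3 }'
}

# Usage on a machine with the driver installed:
#   nvidia-smi --query-gpu=index,memory.used,memory.total \
#              --format=csv,noheader,nounits | mem_usage_percent
```

Fed the numbers from the table above (23055 of 24265 MiB on GPU 0, 0 of 24268 MiB on GPU 1), it reports 95.0% and 0.0% respectively, which matches the single python process holding almost all of the first card's memory.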
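Original-work statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to have it removed.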