
What is cuDNN? How to install CUDA and cuDNN

CloudStudio
Published on 2025-03-17 11:17:12

Summary

This article covers NVIDIA hardware and drivers, the CUDA toolkit, the cuDNN family of libraries, and TensorRT. Using Cloud Studio as the example environment, it walks through opening a GPU workspace, checking the relevant versions, installing and verifying cuDNN, manually installing or upgrading cuDNN, and optionally installing and verifying TensorRT.

What is cuDNN, and why install it? This article introduces the NVIDIA hardware and driver stack (including the NVIDIA driver), the CUDA toolkit (cuda toolkit), the cuDNN family of libraries, and TensorRT, and explains how these layers of hardware, drivers, and software relate to one another. Tencent Cloud Studio is used as the demonstration environment, where we install and configure GPU-accelerated PyTorch.

Introduction to Cloud Studio

Cloud Studio is a browser-based integrated development environment (cloud IDE) that gives developers a stable cloud workstation with access to both CPU and GPU resources. There is nothing to install: open a browser and start working. Cloud Studio offers a free CPU tier (50,000 minutes per month) and a free GPU tier (one Tesla T4 with 16 GB of memory, 10,000 minutes per month). This article uses the Cloud Studio GPU environment for all demonstrations.

Starting a Cloud Studio GPU workspace

- Register and activate Cloud Studio via the link curl.qcloud.com/sdeIX8nx

- Go to ide.cloud.tencent.com/ to reach the Cloud Studio home page

- Click Workspace Templates → AI Templates → Pytorch2.0.0

- Choose Free Basic tier → Confirm

- Click High-performance Workspaces. The entry Pytorch2.0.0 gssrak is the newly created GPU workspace; it already shows a green dot with the status Running

- Click Pytorch2.0.0 gssrak to enter the workspace; it finishes loading in under a minute

Nvidia driver

The NVIDIA driver is the driver software for NVIDIA GPUs. Only with the NVIDIA driver installed can the GPU be driven correctly, whether for display output (Studio/professional and gaming cards) or for accelerated scientific computing (data-center cards). It is also the foundation on which the CUDA toolkit and cuDNN are installed later.

Because Cloud Studio is container-based, the same NVIDIA driver version is already installed on both the host machine and the GPU workspace (which is essentially a container). We can check it with nvidia-smi.

Open a terminal and run nvidia-smi:

(base) root@VM-24-95-ubuntu:/workspace# nvidia-smi
Mon Mar 10 12:13:25 2025       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:09.0 Off |                    0 |
| N/A   31C    P8    10W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Driver Version: 525.105.17 means the NVIDIA driver version is 525.105.17.

CUDA Version: 12.0 means the highest CUDA version this driver can support is 12.0.

In other words, this machine supports CUDA 12.0 and any version below it (CUDA 11.8, CUDA 11.7, CUDA 10.0, and so on). Versions above CUDA 12.0, such as CUDA 12.1 or CUDA 12.8, are not supported.
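As a rough self-check of this rule, the sketch below compares the driver's CUDA ceiling (the "CUDA Version" field that nvidia-smi prints, as in the log above) with the CUDA runtime that PyTorch was built against. It is only an illustration and assumes nvidia-smi is on PATH and a CUDA build of PyTorch is installed (torch.version.cuda is None on CPU-only builds).

# check_cuda_ceiling.py - minimal sketch, assumes nvidia-smi and a CUDA build of torch
import re
import subprocess

import torch

# nvidia-smi prints e.g. "CUDA Version: 12.0", the highest CUDA runtime this driver supports
smi = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
ceiling = tuple(int(x) for x in re.search(r"CUDA Version:\s*([\d.]+)", smi).group(1).split("."))

# the CUDA runtime PyTorch was compiled with, e.g. "11.7" in the Pytorch2.0.0 template
runtime = tuple(int(x) for x in torch.version.cuda.split("."))

print("driver supports up to CUDA", ceiling, "- torch was built with CUDA", runtime)
print("compatible" if runtime <= ceiling else "not supported by this driver")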

CUDA toolkit

The CUDA Toolkit is NVIDIA's complete development tool suite for writing and optimizing CUDA programs. It includes the compiler (nvcc), a debugger, the runtime library (cudart), profiling tools, and various math and computation libraries. Note that if you only need to run TensorFlow or PyTorch, you do not need the (full) CUDA toolkit: the cuDNN subset bundled with the PyTorch or TensorFlow installation is enough for GPU-accelerated computation. The full CUDA toolkit is only required when you need to develop CUDA operators or compile GPU-accelerated extensions (such as the Apex library).
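If you are unsure whether a full toolkit is actually visible for building CUDA extensions, the small sketch below (illustrative only; it relies on the standard torch.utils.cpp_extension.CUDA_HOME lookup and shutil.which) makes the distinction concrete: running PyTorch does not need it, compiling custom operators does.

# toolkit_check.py - minimal sketch: is a full CUDA toolkit available for compiling extensions?
import shutil

from torch.utils.cpp_extension import CUDA_HOME

print("nvcc on PATH:", shutil.which("nvcc"))   # e.g. /usr/local/cuda/bin/nvcc
print("CUDA_HOME   :", CUDA_HOME)              # toolkit directory PyTorch would compile against
if CUDA_HOME is None:
    print("No toolkit found: running PyTorch/TensorFlow still works, "
          "but building CUDA operators (e.g. Apex) will fail.")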

Cloud Studio comes with CUDA toolkit 11.7 installed and configured by default.

Run nvcc -V to check whether the CUDA toolkit is installed:

(base) root@VM-24-95-ubuntu:/workspace# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

Run echo $PATH and check that it includes the path /usr/local/cuda/bin:

(base) root@VM-24-95-ubuntu:/workspace# echo $PATH
/etc/.hai/cloud_studio/vendor/module3/code-oss-dev/bin/remote-cli:/root/miniforge3/bin:/root/miniforge3/condabin:/root/miniforge3/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Run echo $LD_LIBRARY_PATH and check that it includes the path /usr/local/cuda/lib64:

(base) root@VM-24-95-ubuntu:/workspace# echo $LD_LIBRARY_PATH
/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
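The two environment-variable checks above can also be done from Python; the following is just a convenience sketch that inspects the same variables.

# env_check.py - minimal sketch: are the CUDA toolkit paths on PATH / LD_LIBRARY_PATH?
import os

def contains(var, needle):
    return needle in os.environ.get(var, "").split(":")

print("PATH has /usr/local/cuda/bin            :", contains("PATH", "/usr/local/cuda/bin"))
print("LD_LIBRARY_PATH has /usr/local/cuda/lib64:", contains("LD_LIBRARY_PATH", "/usr/local/cuda/lib64"))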

cuDNN

About cuDNN

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly optimized implementations of the operations that appear frequently in DNN applications. In practice, cuDNN is what delivers GPU acceleration inside TensorFlow, PyTorch, and large-model serving platforms.
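To see cuDNN at work from PyTorch, the small sketch below runs a convolution on the GPU; setting torch.backends.cudnn.benchmark lets cuDNN auto-select the fastest convolution algorithm, much like the algorithm search you will see later in the mnistCUDNN sample output. The layer and tensor sizes here are arbitrary and only for illustration.

# cudnn_conv_demo.py - minimal sketch: a convolution executed by cuDNN on the GPU
import torch
import torch.nn as nn

assert torch.cuda.is_available(), "needs a CUDA build of PyTorch and a visible GPU"
torch.backends.cudnn.benchmark = True      # let cuDNN auto-tune the convolution algorithm

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")
y = conv(x)                                # this forward pass is dispatched to cuDNN
torch.cuda.synchronize()
print(y.shape, "cuDNN version:", torch.backends.cudnn.version())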

References:

Official website: https://docs.nvidia.com/cudnn/index.html

Official documentation: https://docs.nvidia.com/deeplearning/cudnn/latest/

Official guide for installing cuDNN on Linux: https://docs.nvidia.com/deeplearning/cudnn/installation/latest/linux.html#installing-the-cuda-toolkit-for-linux

If you opened the workspace from the Pytorch2.0.0 template as described above, there is no need to install cuDNN separately: Cloud Studio already ships a GPU build of PyTorch, which in turn bundles the cuDNN subset it needs.

Checking the cuDNN version

Check whether PyTorch can use CUDA:

python -c "import torch; print(torch.cuda.is_available())"

Check whether cuDNN is enabled:

python -c "import torch; print(torch.backends.cudnn.enabled)"

Check the cuDNN version:

python -c "import torch; print(torch.backends.cudnn.version())"

(base) root@VM-24-95-ubuntu:/workspace# python -c "import torch;print(torch.cuda.is_available())"
True
(base) root@VM-24-95-ubuntu:/workspace# python -c "import torch;print(torch.backends.cudnn.enabled)"
True
(base) root@VM-24-95-ubuntu:/workspace# python -c "import torch;print(torch.backends.cudnn.version())"
8500
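The value 8500 uses the cuDNN 8.x encoding major*1000 + minor*100 + patch, i.e. cuDNN 8.5.0 (cuDNN 9.x uses a different encoding). A tiny sketch to decode it:

# decode an 8.x-style cuDNN version number such as 8500 -> 8.5.0
import torch

v = torch.backends.cudnn.version()              # e.g. 8500
major, minor, patch = v // 1000, (v % 1000) // 100, v % 100
print("cuDNN {}.{}.{}".format(major, minor, patch))   # -> cuDNN 8.5.0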

Because this is the cuDNN subset bundled with PyTorch, you can list its shared libraries with:

find $(python -c "import torch; print(torch.__path__[0])") -name "*cudnn*so*"

(base) root@VM-24-95-ubuntu:/workspace# find $(python -c "import torch; print(torch.__path__[0])") -name "*cudnn*so*"
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_adv_infer.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_cnn_train.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_adv_train.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_ops_train.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_cnn_infer.so.8
/root/miniforge3/lib/python3.10/site-packages/torch/lib/libcudnn_ops_infer.so.8

Verifying the cuDNN installation

Install the sample files and dependencies:

apt -y install libcudnn8-samples libfreeimage-dev build-essential

Since the cuDNN bundled with Cloud Studio's PyTorch is version 8500 (8.5.0), we install libcudnn8-samples here.

Build the sample:

cd /usr/src/cudnn_samples_v8/mnistCUDNN && make clean && make

Run ./mnistCUDNN. If the output ends with Test passed!, cuDNN is installed correctly.

logs of `./mnistCUDNN`

(base) root@VM-24-95-ubuntu:/usr/src/cudnn_samples_v8/mnistCUDNN# make clean && make
rm -rf *o
rm -rf mnistCUDNN
CUDA_VERSION is 11070
Linking agains cublasLt = true
CUDA VERSION: 11070
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86 87 
/usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -ccbin g++ -m64    -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o fp16_dev.o -c fp16_dev.cu
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc   -ccbin g++ -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
(base) root@VM-24-95-ubuntu:/usr/src/cudnn_samples_v8/mnistCUDNN# ./mnistCUDNN 
Executing: mnistCUDNN
cudnnGetVersion() : 8500 , CUDNN_VERSION from cudnn.h : 8500 (8.5.0)
Host compiler version : GCC 9.4.0

There are 1 CUDA capable devices on your machine :
device 0 : sms 40  Capabilities 7.5, SmClock 1590.0 Mhz, MemSize (Mb) 14928, MemClock 5001.0 Mhz, Ecc=1, boardGroupID=0
Using device 0

Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.027136 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.027680 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.059392 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.095232 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.149504 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 5.357568 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.088064 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.088352 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.129024 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.135936 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.144864 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 5.752384 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.025984 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.030496 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.061536 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.085920 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.086048 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.118688 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.080128 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.086432 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.087552 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.124960 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.135456 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.143360 time requiring 128000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.028000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.030048 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.080224 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.086048 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.093568 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 2.026400 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.104480 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.121888 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.129344 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.133152 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.200096 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.919584 time requiring 64000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.032352 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.036704 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.037408 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.079872 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.083968 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.085984 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.083360 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.120096 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.124992 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.127648 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.193344 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.282880 time requiring 64000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Manually installing or upgrading cuDNN (optional)

Because Cloud Studio's AI templates mostly rely on the cuDNN bundled inside the AI frameworks, and every workspace already ships with conda (and pip), installing cuDNN with pip install is the recommended approach.

- For the CUDA 11.7 case: pip install nvidia-cudnn-cu11

If you need a specific minor version: pip install nvidia-cudnn-cu11==9.x.y.z

- You can still install from the tarball archive (see NVIDIA cuDNN Installation, "Tarball Installation": https://docs.nvidia.com/deeplearning/cudnn/installation/latest/linux.html#tarball-installation)

Download the archive:

wget https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-9.8.0.87_cuda11-archive.tar.xz

Extract it into the CUDA toolkit directory:

tar -xf cudnn-linux-x86_64-9.8.0.87_cuda11-archive.tar.xz --strip-components=1 -C /usr/local/cuda

- Or install with conda (see NVIDIA cuDNN Installation, "Conda Installation": https://docs.nvidia.com/deeplearning/cudnn/installation/latest/linux.html#conda-installation): conda install cudnn cuda-version=<X> -c nvidia (fill in the desired CUDA version)

If you installed some dependencies with conda, keep using conda to install, upgrade, and manage dependencies; if you installed with pip, keep using pip. Mixing the two is strongly discouraged: it easily leads to a tangled dependency state that can only be fixed by deleting the environment and reinstalling.
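When cuDNN is installed with pip as above, the libraries normally end up inside the wheel's own package directory rather than under /usr/local/cuda. The sketch below is only an illustration, assuming the nvidia-cudnn-cu11 wheel layout with an nvidia/cudnn/lib folder; it locates the shared libraries so you can, for example, add that folder to LD_LIBRARY_PATH.

# locate_pip_cudnn.py - minimal sketch, assumes the nvidia-cudnn-cu11 wheel layout
import glob
import os

import nvidia.cudnn  # provided by `pip install nvidia-cudnn-cu11`

lib_dir = os.path.join(os.path.dirname(nvidia.cudnn.__file__), "lib")
print("cuDNN libraries shipped by the pip wheel:")
for so in sorted(glob.glob(os.path.join(lib_dir, "libcudnn*.so*"))):
    print(" ", so)
print("export LD_LIBRARY_PATH=" + lib_dir + ":$LD_LIBRARY_PATH")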

TensorRT (optional)

TensorRT is an inference acceleration library that can substantially speed up model inference in production.

Install: pip install tensorrt-cu11

Verify:

python -c "import tensorrt; print(tensorrt.__version__); assert tensorrt.Builder(tensorrt.Logger())"

Notes:

- Because Cloud Studio ships CUDA toolkit 11.7 by default, we use the cu11 builds of TensorRT here.

- Version 10 is the newer release and version 8 the older one (though version 8 is still mainstream); in practice both version 8 and version 10 install fine on Cloud Studio, as the logs below show.

- At the time of writing, pip install tensorrt-cu11 installs TensorRT cu11 version 10 by default.

If you run pip install tensorrt instead, you get TensorRT cu12 version 10.

To install a specific version: pip install tensorrt-cu11==10.0.1 or pip install tensorrt==8.5.3.1
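The TensorRT logs below warn that CUDA lazy loading is not enabled. As the warning itself suggests, setting the CUDA_MODULE_LOADING environment variable to LAZY before CUDA is initialized reduces device-memory usage and speeds up TensorRT initialization; the sketch below is one way to do that from Python.

# trt_lazy_loading.py - minimal sketch: enable CUDA lazy loading before TensorRT touches the GPU
import os

os.environ.setdefault("CUDA_MODULE_LOADING", "LAZY")  # must be set before CUDA is initialized

import tensorrt as trt

print("TensorRT", trt.__version__)
builder = trt.Builder(trt.Logger(trt.Logger.WARNING))  # same check as the verification command above
assert builder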

logs of `pip install tensorrt-cu11`

(base) root@VM-24-95-ubuntu:/workspace# pip install  tensorrt-cu11
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting tensorrt-cu11
  Downloading http://mirrors.tencentyun.com/pypi/packages/ad/04/0d6cffca481309ca0f6904446b4a075ddbf759f249851b54938c43fa6982/tensorrt_cu11-10.9.0.34.tar.gz (18 kB)
  Preparing metadata (setup.py) ... done
Collecting tensorrt_cu11_libs==10.9.0.34 (from tensorrt-cu11)
  Downloading http://mirrors.tencentyun.com/pypi/packages/12/3f/8962914e14e265711f262ad961b437630acacbe794f730f1b6503fe1cec8/tensorrt_cu11_libs-10.9.0.34.tar.gz (704 bytes)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting tensorrt_cu11_bindings==10.9.0.34 (from tensorrt-cu11)
  Downloading http://mirrors.tencentyun.com/pypi/packages/6e/3c/056876197cf050b064fbc4a89a5f72e092ecf7a4f1454f0ca7c579fbc109/tensorrt_cu11_bindings-10.9.0.34-cp310-none-manylinux_2_28_x86_64.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 28.1 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11 (from tensorrt_cu11_libs==10.9.0.34->tensorrt-cu11)
  Downloading http://mirrors.tencentyun.com/pypi/packages/a6/ec/a540f28b31de7bc1ed49eecc72035d4cb77db88ead1d42f7bfa5ae407ac6/nvidia_cuda_runtime_cu11-11.8.89-py3-none-manylinux2014_x86_64.whl (875 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 875.6/875.6 kB 24.6 MB/s eta 0:00:00
Building wheels for collected packages: tensorrt-cu11, tensorrt_cu11_libs
  Building wheel for tensorrt-cu11 (setup.py) ... done
  Created wheel for tensorrt-cu11: filename=tensorrt_cu11-10.9.0.34-py2.py3-none-any.whl size=17466 sha256=48b8117c9b58cef409a1838af20124df8e830c0f91ccb256ce68a34ccb8cbab7
  Stored in directory: /root/.cache/pip/wheels/74/2a/8a/58fb3d73239359b35886927883f9ede3f874dfe000f4847afd
  Building wheel for tensorrt_cu11_libs (pyproject.toml) ... done
  Created wheel for tensorrt_cu11_libs: filename=tensorrt_cu11_libs-10.9.0.34-py2.py3-none-manylinux_2_28_x86_64.whl size=2053243630 sha256=bf85dc722a08f2b28bc206a147737f74c62bf24f93842ea0ab5b6b4094cb0af7
  Stored in directory: /root/.cache/pip/wheels/50/fe/b9/a6137a71b76c0282920b71420d97a280aa7388573cbee6ec28
Successfully built tensorrt-cu11 tensorrt_cu11_libs
Installing collected packages: tensorrt_cu11_bindings, nvidia-cuda-runtime-cu11, tensorrt_cu11_libs, tensorrt-cu11
Successfully installed nvidia-cuda-runtime-cu11-11.8.89 tensorrt-cu11-10.9.0.34 tensorrt_cu11_bindings-10.9.0.34 tensorrt_cu11_libs-10.9.0.34
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
(base) root@VM-24-95-ubuntu:/workspace# python -c "import tensorrt;print(tensorrt.__version__);assert tensorrt.Builder(tensorrt.Logger())"
10.9.0.34
[03/11/2025-01:49:50] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading

logs of `pip install tensorrt==8.5.3.1`

(base) root@VM-24-95-ubuntu:/workspace# pip install  tensorrt==8.5.3.1
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting tensorrt==8.5.3.1
  Downloading http://mirrors.tencentyun.com/pypi/packages/3e/d5/5f9dd454a89f5bf09c3740c649ba6c8dd685cae98a1255299a2e1dbac606/tensorrt-8.5.3.1-cp310-none-manylinux_2_17_x86_64.whl (549.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 549.5/549.5 MB 47.7 MB/s eta 0:00:00
Requirement already satisfied: nvidia-cuda-runtime-cu11 in /root/miniforge3/lib/python3.10/site-packages (from tensorrt==8.5.3.1) (11.8.89)
Collecting nvidia-cudnn-cu11 (from tensorrt==8.5.3.1)
  Downloading http://mirrors.tencentyun.com/pypi/packages/22/32/6385ef0da5e01553e3b8ad55428fd4824cbff29ff941185082b17f030c9e/nvidia_cudnn_cu11-9.8.0.87-py3-none-manylinux_2_27_x86_64.whl (434.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 434.5/434.5 MB 72.8 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11 (from tensorrt==8.5.3.1)
  Downloading http://mirrors.tencentyun.com/pypi/packages/ea/2e/9d99c60771d275ecf6c914a612e9a577f740a615bc826bec132368e1d3ae/nvidia_cublas_cu11-11.11.3.6-py3-none-manylinux2014_x86_64.whl (417.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 417.9/417.9 MB 63.4 MB/s eta 0:00:00
Installing collected packages: nvidia-cublas-cu11, nvidia-cudnn-cu11, tensorrt
Successfully installed nvidia-cublas-cu11-11.11.3.6 nvidia-cudnn-cu11-9.8.0.87 tensorrt-8.5.3.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
(base) root@VM-24-95-ubuntu:/workspace# python -c "import tensorrt;print(tensorrt.__version__);assert tensorrt.Builder(tensorrt.Logger())"
8.5.3.1
[03/11/2025-02:03:52] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars

Troubleshooting

1. The NVIDIA driver got broken

Problem:

After a series of configuration and installation steps, something in the driver stack appears to be broken.

Cause:

- The driver or the CUDA toolkit was most likely updated with apt install or with bash NVIDIA-Linux-x86_64-XXX.XXX.XXX.run. Updating the driver this way cannot succeed on Cloud Studio.

- pip install should not be able to break the driver environment.

- Because Cloud Studio mounts the NVIDIA driver read-only into the container workspace, uninstalling the driver the user installed restores the original one. (If you also modified $PATH or LD_LIBRARY_PATH, restore them to their original values as well.)

Fix: apt remove *nvidia* -y

Appendix:

Appendix 1: nvidia-smi fails after updating the driver with apt install

(base) root@VM-24-95-ubuntu:/workspace# apt install nvidia-driver-535 -y
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libnvidia-cfg1-535 libnvidia-common-535 libnvidia-compute-535
  libnvidia-decode-535 libnvidia-encode-535 libnvidia-extra-535
  libnvidia-fbc1-535 libnvidia-gl-535 nvidia-compute-utils-535 nvidia-dkms-535
  nvidia-kernel-common-535 nvidia-kernel-source-535 nvidia-prime
  nvidia-settings nvidia-utils-535 xserver-xorg-video-nvidia-535
Recommended packages:
  libnvidia-compute-535:i386 libnvidia-decode-535:i386
  libnvidia-encode-535:i386 libnvidia-fbc1-535:i386 libnvidia-gl-535:i386
The following NEW packages will be installed:
  libnvidia-cfg1-535 libnvidia-common-535 libnvidia-compute-535
  libnvidia-decode-535 libnvidia-encode-535 libnvidia-extra-535
  libnvidia-fbc1-535 libnvidia-gl-535 nvidia-compute-utils-535 nvidia-dkms-535
  nvidia-driver-535 nvidia-kernel-common-535 nvidia-kernel-source-535
  nvidia-prime nvidia-settings nvidia-utils-535 xserver-xorg-video-nvidia-535
0 upgraded, 17 newly installed, 0 to remove and 4 not upgraded.
Need to get 308 MB of archives.
After this operation, 801 MB of additional disk space will be used.
Get:1 http://mirrors.cloud.tencent.com/ubuntu focal-updates/main amd64 nvidia-prime all 0.8.16~0.20.04.2 [9960 B]
Get:2 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-cfg1-535 535.230.02-0ubuntu1 [98.9 kB]
Get:3 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-common-535 535.230.02-0ubuntu1 [14.9 kB]
Get:4 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-compute-535 535.230.02-0ubuntu1 [36.9 MB]
Get:5 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-decode-535 535.230.02-0ubuntu1 [1660 kB]
Get:6 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-encode-535 535.230.02-0ubuntu1 [90.0 kB]
Get:7 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-extra-535 535.230.02-0ubuntu1 [256 kB]
Get:8 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-fbc1-535 535.230.02-0ubuntu1 [51.3 kB]
Get:9 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  libnvidia-gl-535 535.230.02-0ubuntu1 [183 MB]
Get:10 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-compute-utils-535 535.230.02-0ubuntu1 [285 kB]
Get:11 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-kernel-source-535 535.230.02-0ubuntu1 [44.5 MB]
Get:12 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-kernel-common-535 535.230.02-0ubuntu1 [38.4 MB]
Get:13 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-dkms-535 535.230.02-0ubuntu1 [34.2 kB]
Get:14 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-utils-535 535.230.02-0ubuntu1 [382 kB]
Get:15 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  xserver-xorg-video-nvidia-535 535.230.02-0ubuntu1 [1504 kB]
Get:16 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-driver-535 535.230.02-0ubuntu1 [478 kB]
Get:17 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  nvidia-settings 570.124.06-0ubuntu1 [951 kB]
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
  LANGUAGE = (unset),
  LC_ALL = (unset),
  LC_CTYPE = "C.UTF-8",
  LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
debconf: delaying package configuration, since apt-utils is not installed
Fetched 308 MB in 27s (11.5 MB/s)
Selecting previously unselected package libnvidia-cfg1-535:amd64.
(Reading database ... 
(Reading database ... 5%
(Reading database ... 10%
(Reading database ... 15%
(Reading database ... 20%
(Reading database ... 25%
(Reading database ... 30%
(Reading database ... 35%
(Reading database ... 40%
(Reading database ... 45%
(Reading database ... 50%
(Reading database ... 55%
(Reading database ... 60%
(Reading database ... 65%
(Reading database ... 70%
(Reading database ... 75%
(Reading database ... 80%
(Reading database ... 85%
(Reading database ... 90%
(Reading database ... 95%
(Reading database ... 100%
(Reading database ... 84774 files and directories currently installed.)
Preparing to unpack .../00-libnvidia-cfg1-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-cfg1-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-common-535.
Preparing to unpack .../01-libnvidia-common-535_535.230.02-0ubuntu1_all.deb ...
Unpacking libnvidia-common-535 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-compute-535:amd64.
Preparing to unpack .../02-libnvidia-compute-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-compute-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-decode-535:amd64.
Preparing to unpack .../03-libnvidia-decode-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-decode-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-encode-535:amd64.
Preparing to unpack .../04-libnvidia-encode-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-encode-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-extra-535:amd64.
Preparing to unpack .../05-libnvidia-extra-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-extra-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-fbc1-535:amd64.
Preparing to unpack .../06-libnvidia-fbc1-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking libnvidia-fbc1-535:amd64 (535.230.02-0ubuntu1) ...
Selecting previously unselected package libnvidia-gl-535:amd64.
Preparing to unpack .../07-libnvidia-gl-535_535.230.02-0ubuntu1_amd64.deb ...
dpkg-query: no packages found matching libnvidia-gl-450
Unpacking libnvidia-gl-535:amd64 (535.230.02-0ubuntu1) ...
Preparing to unpack .../08-nvidia-compute-utils-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-compute-utils-535 (535.230.02-0ubuntu1) ...
dpkg: error processing archive /tmp/apt-dpkg-install-weWcQR/08-nvidia-compute-utils-535_535.230.02-0ubuntu1_amd64.deb (--unpack):
 unable to make backup link of './usr/bin/nvidia-cuda-mps-control' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Selecting previously unselected package nvidia-kernel-source-535.
Preparing to unpack .../09-nvidia-kernel-source-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-kernel-source-535 (535.230.02-0ubuntu1) ...
Selecting previously unselected package nvidia-kernel-common-535.
Preparing to unpack .../10-nvidia-kernel-common-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-kernel-common-535 (535.230.02-0ubuntu1) ...
Selecting previously unselected package nvidia-dkms-535.
Preparing to unpack .../11-nvidia-dkms-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-dkms-535 (535.230.02-0ubuntu1) ...
Preparing to unpack .../12-nvidia-utils-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-utils-535 (535.230.02-0ubuntu1) ...
dpkg: error processing archive /tmp/apt-dpkg-install-weWcQR/12-nvidia-utils-535_535.230.02-0ubuntu1_amd64.deb (--unpack):
 unable to make backup link of './usr/bin/nvidia-debugdump' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Selecting previously unselected package xserver-xorg-video-nvidia-535.
Preparing to unpack .../13-xserver-xorg-video-nvidia-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking xserver-xorg-video-nvidia-535 (535.230.02-0ubuntu1) ...
Selecting previously unselected package nvidia-driver-535.
Preparing to unpack .../14-nvidia-driver-535_535.230.02-0ubuntu1_amd64.deb ...
Unpacking nvidia-driver-535 (535.230.02-0ubuntu1) ...
Selecting previously unselected package nvidia-prime.
Preparing to unpack .../15-nvidia-prime_0.8.16~0.20.04.2_all.deb ...
Unpacking nvidia-prime (0.8.16~0.20.04.2) ...
Selecting previously unselected package nvidia-settings.
Preparing to unpack .../16-nvidia-settings_570.124.06-0ubuntu1_amd64.deb ...
Unpacking nvidia-settings (570.124.06-0ubuntu1) ...
Errors were encountered while processing:
 /tmp/apt-dpkg-install-weWcQR/08-nvidia-compute-utils-535_535.230.02-0ubuntu1_amd64.deb
 /tmp/apt-dpkg-install-weWcQR/12-nvidia-utils-535_535.230.02-0ubuntu1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
(base) root@VM-24-95-ubuntu:/workspace# nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

Appendix 2: after the fix with apt remove, nvidia-smi works again

(base) root@VM-24-95-ubuntu:/workspace# apt remove *nvidia* -y

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
Package 'nvidia-304' is not installed, so not removed
(some log lines omitted here)
Package 'linux-objects-nvidia-535-server-5.15.0-1049-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1049-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1049-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1049-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1049-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-105-generic' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-105-lowlatency' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1050-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1050-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1050-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1050-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1051-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1051-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1051-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1051-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1052-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1052-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1052-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1052-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1052-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1053-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1053-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1053-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1053-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1054-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1055-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1055-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1055-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1055-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1056-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1057-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1057-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1058-aws' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1058-azure' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1058-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1058-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1058-oracle' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1059-gcp' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1059-intel-iotg' is not installed, so not removed
Package 'linux-objects-nvidia-535-server-5.15.0-1059-oracle' is not installed, 
The following packages were automatically installed and are no longer required:
  accountsservice acl apg apport apport-symptoms aptdaemon aptdaemon-data
  aspell aspell-en avahi-daemon avahi-utils bind9-host bind9-libs bluez bolt
  bubblewrap cheese-common colord colord-data cracklib-runtime crda
  cups-pk-helper dconf-cli dctrl-tools desktop-file-utils dictionaries-common
  dkms dns-root-data dnsmasq-base docbook-xml emacsen-common enchant-2
  evolution-data-server evolution-data-server-common fprintd gdm3 geoclue-2.0
  gettext-base gir1.2-accountsservice-1.0 gir1.2-atk-1.0 gir1.2-atspi-2.0
  gir1.2-freedesktop gir1.2-gck-1 gir1.2-gcr-3 gir1.2-gdesktopenums-3.0
  gir1.2-gdkpixbuf-2.0 gir1.2-gdm-1.0 gir1.2-geoclue-2.0
  gir1.2-gnomebluetooth-1.0 gir1.2-gnomedesktop-3.0 gir1.2-graphene-1.0
  gir1.2-gtk-3.0 gir1.2-gweather-3.0 gir1.2-ibus-1.0 gir1.2-json-1.0
  gir1.2-mutter-6 gir1.2-nm-1.0 gir1.2-nma-1.0 gir1.2-notify-0.7
  gir1.2-pango-1.0 gir1.2-polkit-1.0 gir1.2-rsvg-2.0 gir1.2-secret-1
  gir1.2-soup-2.4 gir1.2-upowerglib-1.0 gir1.2-vte-2.91 gjs gkbd-capplet
  gnome-control-center gnome-control-center-data gnome-control-center-faces
  gnome-desktop3-data gnome-keyring gnome-keyring-pkcs11 gnome-menus
  gnome-online-accounts gnome-session-bin gnome-session-common
  gnome-settings-daemon gnome-settings-daemon-common gnome-shell
  gnome-shell-common gnome-startup-applications gnome-user-docs groff-base
  gstreamer1.0-clutter-3.0 gstreamer1.0-gl gstreamer1.0-plugins-base
  gstreamer1.0-plugins-good gstreamer1.0-pulseaudio gstreamer1.0-x
  hunspell-en-us ibus ibus-data ibus-gtk ibus-gtk3 iio-sensor-proxy im-config
  ippusbxd iptables iw keyboard-configuration kmod language-selector-common
  language-selector-gnome libaa1 libaccountsservice0 libappindicator3-1
  libarchive13 libasound2-plugins libaspell15 libasyncns0 libavahi-core7
  libavahi-glib1 libavc1394-0 libbluetooth3 libboost-thread1.71.0 libcaca0
  libcamel-1.2-62 libcanberra-gtk3-0 libcanberra-gtk3-module libcanberra-pulse
  libcdparanoia0 libcheese-gtk25 libcheese8 libclutter-1.0-0
  libclutter-1.0-common libclutter-gst-3.0-0 libclutter-gtk-1.0-0
  libcogl-common libcogl-pango20 libcogl-path20 libcogl20 libcolord-gtk1
  libcolorhug2 libcrack2 libdaemon0 libdbusmenu-glib4 libdbusmenu-gtk3-4
  libdrm-amdgpu1 libdrm-common libdrm-intel1 libdrm-nouveau2 libdrm-radeon1
  libdrm2 libdv4 libebackend-1.2-10 libebook-1.2-20 libebook-contacts-1.2-3
  libecal-2.0-1 libedata-book-1.2-26 libedata-cal-2.0-1 libedataserver-1.2-24
  libedataserverui-1.2-2 libegl-mesa0 libegl1 libenchant-2-2 libevdev2
  libexif12 libflac8 libfontenc1 libfprint-2-2 libgail-common libgail18
  libgbm1 libgd3 libgdata-common libgdata22 libgdm1 libgee-0.8-2
  libgeoclue-2-0 libgeocode-glib0 libgjs0g libgl1 libgl1-mesa-dri
  libglapi-mesa libgles2 libglvnd0 libglx-mesa0 libglx0 libgnome-autoar-0-0
  libgnome-bluetooth13 libgnome-desktop-3-19 libgnomekbd-common libgnomekbd8
  libgoa-1.0-0b libgoa-1.0-common libgoa-backend-1.0-1 libgphoto2-6
  libgphoto2-l10n libgphoto2-port12 libgraphene-1.0-0 libgsound0
  libgssdp-1.2-0 libgstreamer-gl1.0-0 libgstreamer-plugins-base1.0-0
  libgstreamer-plugins-good1.0-0 libgtk2.0-0 libgtk2.0-bin libgtk2.0-common
  libgtop-2.0-11 libgtop2-common libgudev-1.0-0 libgupnp-1.2-0
  libgupnp-av-1.0-2 libgupnp-dlna-2.0-3 libgusb2 libgweather-3-16
  libgweather-common libharfbuzz-icu0 libhunspell-1.7-0 libhyphen0
  libibus-1.0-5 libical3 libice6 libidn11 libiec61883-0 libieee1284-3
  libimobiledevice6 libinput-bin libinput10 libip6tc2 libjack-jackd2-0
  libjansson4 libjavascriptcoregtk-4.0-18 libldb2 libllvm12 libmaxminddb0
  libmbim-glib4 libmbim-proxy libmediaart-2.0-0 libmm-glib0 libmnl0
  libmozjs-68-0 libmp3lame0 libmpg123-0 libmtdev1 libmutter-6-0
  libmysqlclient21 libndp0 libnetfilter-conntrack3 libnewt0.52 libnfnetlink0
  libnftnl11 libnl-3-200 libnl-genl-3-200 libnl-route-3-200 libnm0 libnma0
  libnotify4 libnspr4 libnss-mdns libnss3 libopengl0 libopus0 liborc-0.4-0
  libpam-fprintd libpam-gnome-keyring libpangoxft-1.0-0 libpcap0.8 libpci3
  libpciaccess0 libpcsclite1 libphonenumber7 libpipeline1 libplist3
  libprotobuf17 libpulse-mainloop-glib0 libpulse0 libpulsedsp
  libpwquality-common libpwquality1 libqmi-glib5 libqmi-proxy libraw1394-11
  librygel-core-2.6-2 librygel-db-2.6-2 librygel-renderer-2.6-2
  librygel-server-2.6-2 libsamplerate0 libsane libsane-common libsbc1
  libsensors-config libsensors5 libshout3 libslang2 libsm6 libsmbclient
  libsnapd-glib1 libsndfile1 libsnmp-base libsnmp35 libsodium23 libsoxr0
  libspeex1 libspeexdsp1 libstartup-notification0 libtag1v5 libtag1v5-vanilla
  libtalloc2 libteamdctl0 libtevent0 libtext-iconv-perl libtheora0 libtwolame0
  libuchardet0 libudisks2-0 libunwind8 libupower-glib3 libusb-1.0-0
  libusbmuxd6 libuv1 libv4l-0 libv4lconvert0 libvdpau1 libvisual-0.4-0
  libvorbisenc2 libvpx6 libvte-2.91-0 libvte-2.91-common libvulkan1
  libwacom-bin libwacom-common libwacom2 libwavpack1 libwayland-server0
  libwbclient0 libwebkit2gtk-4.0-37 libwebpdemux2 libwebrtc-audio-processing1
  libwhoopsie-preferences0 libwhoopsie0 libwoff1 libx11-xcb1 libxatracker2
  libxaw7 libxcb-dri2-0 libxcb-dri3-0 libxcb-glx0 libxcb-icccm4 libxcb-image0
  libxcb-keysyms1 libxcb-present0 libxcb-randr0 libxcb-render-util0
  libxcb-res0 libxcb-shape0 libxcb-sync1 libxcb-util1 libxcb-xfixes0
  libxcb-xkb1 libxcb-xv0 libxfont2 libxft2 libxkbcommon-x11-0 libxkbfile1
  libxklavier16 libxmu6 libxnvctrl0 libxpm4 libxshmfence1 libxslt1.1 libxss1
  libxt6 libxtables12 libxv1 libxvmc1 libxxf86vm1 libyelp0
  linux-headers-5.4.0-208 linux-headers-5.4.0-208-generic
  linux-headers-generic man-db mesa-vdpau-drivers mesa-vulkan-drivers
  mobile-broadband-provider-info modemmanager mousetweaks mutter mutter-common
  mysql-common network-manager network-manager-gnome network-manager-pptp
  p11-kit p11-kit-modules pci.ids pkg-config ppp pptp-linux pulseaudio
  pulseaudio-module-bluetooth pulseaudio-utils python3-apport
  python3-aptdaemon python3-aptdaemon.gtk3widgets python3-blinker
  python3-cairo python3-cffi-backend python3-cryptography python3-cups
  python3-cupshelpers python3-defer python3-entrypoints python3-httplib2
  python3-ibus-1.0 python3-jwt python3-keyring python3-launchpadlib
  python3-lazr.restfulclient python3-lazr.uri python3-ldb
  python3-macaroonbakery python3-nacl python3-oauthlib python3-problem-report
  python3-protobuf python3-pymacaroons python3-rfc3339 python3-secretstorage
  python3-simplejson python3-systemd python3-talloc python3-tz python3-wadllib
  python3-xkit rtkit rygel samba-libs sane-utils screen-resolution-extra
  session-migration sgml-base sgml-data sudo switcheroo-control
  system-config-printer system-config-printer-common
  system-config-printer-udev ubuntu-docs ubuntu-session ubuntu-wallpapers
  ubuntu-wallpapers-focal udev update-inetd upower usb-modeswitch
  usb-modeswitch-data usb.ids usbmuxd vdpau-driver-all wamerican
  whoopsie-preferences wireless-regdb wpasupplicant x11-xkb-utils
  x11-xserver-utils xdg-dbus-proxy xfonts-base xfonts-encodings xfonts-utils
  xml-core xserver-common xserver-xephyr xserver-xorg xserver-xorg-core
  xserver-xorg-input-all xserver-xorg-input-libinput xserver-xorg-input-wacom
  xserver-xorg-legacy xserver-xorg-video-all xserver-xorg-video-amdgpu
  xserver-xorg-video-ati xserver-xorg-video-fbdev xserver-xorg-video-intel
  xserver-xorg-video-nouveau xserver-xorg-video-qxl xserver-xorg-video-radeon
  xserver-xorg-video-vesa xserver-xorg-video-vmware xwayland
  yaru-theme-gnome-shell yelp yelp-xsl zenity zenity-common
Use 'apt autoremove' to remove them.
The following packages will be REMOVED:
  libnvidia-cfg1-535 libnvidia-common-535 libnvidia-compute-535
  libnvidia-decode-535 libnvidia-encode-535 libnvidia-extra-535
  libnvidia-fbc1-535 libnvidia-gl-535 nvidia-dkms-535 nvidia-driver-535
  nvidia-kernel-common-535 nvidia-kernel-source-535 nvidia-prime
  nvidia-settings xserver-xorg-video-nvidia-535
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
  LANGUAGE = (unset),
  LC_ALL = (unset),
  LC_CTYPE = "C.UTF-8",
  LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
0 upgraded, 0 newly installed, 15 to remove and 4 not upgraded.
15 not fully installed or removed.
After this operation, 798 MB disk space will be freed.
(Reading database ... 
(Reading database ... 5%
(Reading database ... 10%
(Reading database ... 15%
(Reading database ... 20%
(Reading database ... 25%
(Reading database ... 30%
(Reading database ... 35%
(Reading database ... 40%
(Reading database ... 45%
(Reading database ... 50%
(Reading database ... 55%
(Reading database ... 60%
(Reading database ... 65%
(Reading database ... 70%
(Reading database ... 75%
(Reading database ... 80%
(Reading database ... 85%
(Reading database ... 90%
(Reading database ... 95%
(Reading database ... 100%
(Reading database ... 85474 files and directories currently installed.)
Removing nvidia-driver-535 (535.230.02-0ubuntu1) ...
Removing xserver-xorg-video-nvidia-535 (535.230.02-0ubuntu1) ...
Removing libnvidia-cfg1-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-gl-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-common-535 (535.230.02-0ubuntu1) ...
Removing libnvidia-encode-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-decode-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-compute-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-extra-535:amd64 (535.230.02-0ubuntu1) ...
Removing libnvidia-fbc1-535:amd64 (535.230.02-0ubuntu1) ...
Removing nvidia-dkms-535 (535.230.02-0ubuntu1) ...
Removing nvidia-kernel-common-535 (535.230.02-0ubuntu1) ...
Removing nvidia-kernel-source-535 (535.230.02-0ubuntu1) ...
Removing nvidia-prime (0.8.16~0.20.04.2) ...
Removing nvidia-settings (570.124.06-0ubuntu1) ...
Processing triggers for mime-support (3.64ubuntu1) ...
Processing triggers for gnome-menus (3.36.0-1ubuntu1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.17) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for dbus (1.12.16-2ubuntu2.3) ...
Processing triggers for desktop-file-utils (0.24-1ubuntu3) ...
(base) root@VM-24-95-ubuntu:/workspace# nvidia-smi
Fri Mar 14 03:24:46 2025       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:09.0 Off |                    0 |
| N/A   31C    P8     9W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+