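I've recently been working through the AI course from fastai, and my laptop happens to have an NVIDIA discrete GPU. I had switched to the integrated graphics a while back because the NVIDIA card drew too much power and ran hot. Time to prove your worth, NVIDIA; come on out, little one. I cast the summoning spell, nvidia-settings, and was dumbfounded: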
ERROR: NVIDIA driver is not loaded
ERROR: Unable to load info from any available system
(nvidia-settings:317): GLib-GObject-CRITICAL **: 06:42:43.821: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
** Message: 06:42:43.855: PRIME: No offloading required. Abort
** Message: 06:42:43.855: PRIME: is it supported? no
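"NVIDIA driver is not loaded" refers to the kernel module, so a first check is whether a module named nvidia shows up in /proc/modules at all. Below is a minimal sketch of that check; the function name module_loaded and its optional second argument (an alternative listing file, added only so the function can be tried without a GPU) are my own assumptions, not part of any NVIDIA tool.

```shell
#!/bin/sh
# Return 0 if the named kernel module appears in a /proc/modules-style listing.
# $1: module name as the kernel spells it (underscores, e.g. nvidia_drm)
# $2: file to read; hypothetical parameter for testing, defaults to /proc/modules
module_loaded() {
    mod=${1:?module name required}
    listing=${2:-/proc/modules}
    # The module name is the first whitespace-separated field of each line.
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Report the state of the modules that appear in this article.
for m in nvidia nvidia_modeset nvidia_drm nvidia_uvm; do
    if module_loaded "$m" 2>/dev/null; then
        echo "$m: loaded"
    else
        echo "$m: not loaded"
    fi
done
```

The same information is available interactively from lsmod, which is a formatted view of /proc/modules.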
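Running nvidia-smi reports the same error: the driver is gone. Did I uninstall the driver outright the last time I switched GPUs? When did I get so ruthless as to pull it up by the roots? I really can't remember, so let's just reinstall it first: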
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-460 # adjust the version number to the driver that suits your GPU
sudo apt-get install mesa-common-dev
sudo apt-get install freeglut3-dev
The install went smoothly, and the install log didn't show any problems either:
tianlang@tianlang:spark$ sudo apt-get install nvidia-driver-460
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libnvidia-cfg1-460 libnvidia-compute-460
libnvidia-compute-460:i386 libnvidia-decode-460
libnvidia-decode-460:i386 libnvidia-encode-460
libnvidia-encode-460:i386 libnvidia-extra-460
libnvidia-fbc1-460 libnvidia-fbc1-460:i386 libnvidia-gl-460
libnvidia-gl-460:i386 libnvidia-ifr1-460
libnvidia-ifr1-460:i386 nvidia-compute-utils-460
nvidia-dkms-460 nvidia-kernel-common-460
nvidia-kernel-source-460 nvidia-utils-460
xserver-xorg-video-nvidia-460
The following packages will be upgraded:
libnvidia-cfg1-460 libnvidia-compute-460
libnvidia-compute-460:i386 libnvidia-decode-460
libnvidia-decode-460:i386 libnvidia-encode-460
libnvidia-encode-460:i386 libnvidia-extra-460
libnvidia-fbc1-460 libnvidia-fbc1-460:i386 libnvidia-gl-460
libnvidia-gl-460:i386 libnvidia-ifr1-460
libnvidia-ifr1-460:i386 nvidia-compute-utils-460
nvidia-dkms-460 nvidia-driver-460 nvidia-kernel-common-460
nvidia-kernel-source-460 nvidia-utils-460
xserver-xorg-video-nvidia-460
21 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Need to get 175 MB of archives.
After this operation, 156 kB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic/main amd64 nvidia-driver-460 amd64 460.67-0ubuntu0~0.18.04.1 [433 kB]
...
Fetched 175 MB in 11min 55s (245 kB/s)
(Reading database ... 296611 files and directories currently installed.)
Preparing to unpack .../00-nvidia-driver-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking nvidia-driver-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../01-libnvidia-gl-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
...
Removing all DKMS Modules
Done.
Unpacking nvidia-dkms-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../04-nvidia-kernel-source-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking nvidia-kernel-source-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../05-nvidia-kernel-common-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking nvidia-kernel-common-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../06-libnvidia-decode-460_460.67-0ubuntu0~0.18.04.1_i386.deb ...
De-configuring libnvidia-decode-460:amd64 (460.56-0ubuntu0.18.04.1) ...
Unpacking libnvidia-decode-460:i386 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../07-libnvidia-decode-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking libnvidia-decode-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../08-libnvidia-compute-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
De-configuring libnvidia-compute-460:i386 (460.56-0ubuntu0.18.04.1) ...
Unpacking libnvidia-compute-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../09-libnvidia-compute-460_460.67-0ubuntu0~0.18.04.1_i386.deb ...
Unpacking libnvidia-compute-460:i386 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../10-libnvidia-extra-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking libnvidia-extra-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../11-nvidia-compute-utils-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking nvidia-compute-utils-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../12-libnvidia-encode-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
De-configuring libnvidia-encode-460:i386 (460.56-0ubuntu0.18.04.1) ...
Unpacking libnvidia-encode-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../13-libnvidia-encode-460_460.67-0ubuntu0~0.18.04.1_i386.deb ...
Unpacking libnvidia-encode-460:i386 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../14-nvidia-utils-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking nvidia-utils-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../15-xserver-xorg-video-nvidia-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking xserver-xorg-video-nvidia-460 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../16-libnvidia-ifr1-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
De-configuring libnvidia-ifr1-460:i386 (460.56-0ubuntu0.18.04.1) ...
Unpacking libnvidia-ifr1-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../17-libnvidia-ifr1-460_460.67-0ubuntu0~0.18.04.1_i386.deb ...
Unpacking libnvidia-ifr1-460:i386 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../18-libnvidia-fbc1-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
De-configuring libnvidia-fbc1-460:i386 (460.56-0ubuntu0.18.04.1) ...
Unpacking libnvidia-fbc1-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../19-libnvidia-fbc1-460_460.67-0ubuntu0~0.18.04.1_i386.deb ...
Unpacking libnvidia-fbc1-460:i386 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Preparing to unpack .../20-libnvidia-cfg1-460_460.67-0ubuntu0~0.18.04.1_amd64.deb ...
Unpacking libnvidia-cfg1-460:amd64 (460.67-0ubuntu0~0.18.04.1) over (460.56-0ubuntu0.18.04.1) ...
Setting up libnvidia-extra-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-fbc1-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-fbc1-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-gl-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-gl-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-ifr1-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-ifr1-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-compute-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-compute-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up nvidia-kernel-source-460 (460.67-0ubuntu0~0.18.04.1) ...
Setting up nvidia-utils-460 (460.67-0ubuntu0~0.18.04.1) ...
Setting up nvidia-kernel-common-460 (460.67-0ubuntu0~0.18.04.1) ...
update-initramfs: deferring update (trigger activated)
Setting up libnvidia-cfg1-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-decode-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-decode-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up nvidia-compute-utils-460 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-encode-460:amd64 (460.67-0ubuntu0~0.18.04.1) ...
Setting up libnvidia-encode-460:i386 (460.67-0ubuntu0~0.18.04.1) ...
Setting up xserver-xorg-video-nvidia-460 (460.67-0ubuntu0~0.18.04.1) ...
Setting up nvidia-dkms-460 (460.67-0ubuntu0~0.18.04.1) ...
update-initramfs: deferring update (trigger activated)
INFO:Enable nvidia
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/dell_latitude
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/lenovo_thinkpad
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here
Loading new nvidia-460.67 DKMS files...
Building for 4.15.0-141-generic
Building for architecture x86_64
Building initial module for 4.15.0-141-generic
Secure Boot not enabled on this system.
Done.
nvidia:
Running module version sanity check.
- Original module
- This kernel never originally had a module by this name
- Installation
- Installing to /lib/modules/4.15.0-141-generic/extra/
nvidia-modeset.ko:
Running module version sanity check.
Good news! Module version 460.67 for nvidia-modeset.ko
exactly matches what is already found in kernel 4.15.0-141-generic.
DKMS will not replace this module.
You may override by specifying --force.
nvidia-drm.ko:
Running module version sanity check.
- Original module
- This kernel never originally had a module by this name
- Installation
- Installing to /lib/modules/4.15.0-141-generic/extra/
nvidia-uvm.ko:
Running module version sanity check.
Good news! Module version for nvidia-uvm.ko
exactly matches what is already found in kernel 4.15.0-141-generic.
DKMS will not replace this module.
You may override by specifying --force.
depmod...
DKMS: install completed.
...
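The DKMS log claims the modules were installed under /lib/modules/4.15.0-141-generic/extra/. One way to double check that the .ko files really exist for the running kernel is to search the module tree directly. This is only a sketch: find_module_files is a hypothetical helper name, and its second argument (the search root) is an assumption added so it can be pointed at a scratch directory.

```shell
#!/bin/sh
# List kernel module files whose file name starts with the given prefix.
# $1: file name prefix, e.g. nvidia
# $2: directory to search; hypothetical parameter, defaults to the running kernel's tree
find_module_files() {
    prefix=${1:?module name prefix required}
    root=${2:-/lib/modules/$(uname -r)}
    # Match plain and compressed modules (.ko, .ko.xz, ...); sort for stable output.
    find "$root" -type f -name "${prefix}*.ko*" 2>/dev/null | sort
}

# Usage (prints nothing if no matching module file is installed):
#   find_module_files nvidia
```

On a real system, dkms status and modinfo nvidia answer similar questions: the first lists what DKMS has built and installed, the second prints the metadata of the module that modprobe would pick.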
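To be on the safe side I rebooted, then summoned NVIDIA again: same old recipe, same old taste. The errors had not changed.
This was getting a bit weird, so I asked the GPU manager what was going on: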
spark$ sudo gpu-manager
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-nvidia-was-loaded file
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/4.15.0-141-generic/updates/dkms
Error: can't open /lib/modules/4.15.0-141-generic/updates/dkms
Looking for amdgpu modules in /lib/modules/4.15.0-141-generic/updates/dkms
Error: can't open /lib/modules/4.15.0-141-generic/updates/dkms
Is nvidia loaded? no
Was nvidia unloaded? no
Is nvidia blacklisted? yes
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is nvidia kernel module available? no
Is amdgpu kernel module available? no
Vendor/Device Id: 8086:191b
BusID "PCI:0@0:2:0"
Is boot vga? yes
Vendor/Device Id: 10de:139a
BusID "PCI:1@0:0:0"
can't open /sys/bus/pci/devices/0000:01:00.0/boot_vga
Is boot vga? no
Error: can't access /sys/bus/pci/devices/0000:01:00.0/driver
The device is not bound to any driver.
can't open /sys/bus/pci/devices/0000:01:00.0/boot_vga
can't access /etc/u-d-c-nvidia-runtimepm-override file
can't open /sys/module/nvidia/version
Warning: cannot check the NVIDIA driver major version
Support for runtimepm not detected.
You can override this check at your own risk by creating the /etc/u-d-c-nvidia-runtimepm-override file.
Is nvidia runtime pm supported for "0x139a"? no
Checking power status in /proc/driver/nvidia/gpus/0000:01:00.0/power
Error while opening /proc/driver/nvidia/gpus/0000:01:00.0/power
Is nvidia runtime pm enabled for "0x139a"? no
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card0", driven by "i915"
Found "/dev/dri/card0", driven by "i915"
output 0:
card0-eDP-1
Number of connected outputs for /dev/dri/card0: 1
Does it require offloading? no
last cards number = 2
Has amd? no
Has intel? yes
Has nvidia? yes
How many cards? 2
Has the system changed? No
Intel IGP detected
Desktop system detected
or laptop with open drivers
Nothing to do
The GPU manager reported a great deal, and one line stood out as possibly useful:
Is nvidia blacklisted? yes
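The report is long, so filtering it helps. The sketch below pulls out every driver that the report marks as blacklisted. It relies only on the line format visible in the output above ("Is <driver> blacklisted? yes"); the function name blacklisted_drivers is my own, and I am assuming the report arrives on standard output (append 2>&1 to the gpu-manager call if it does not).

```shell
#!/bin/sh
# Read a gpu-manager report on stdin and print one blacklisted driver per line.
blacklisted_drivers() {
    # Keep only lines of the form "Is <driver> blacklisted? yes" and print <driver>.
    sed -n 's/^Is \([a-z]*\) blacklisted? yes$/\1/p'
}

# Usage:
#   sudo gpu-manager | blacklisted_drivers
```

For the report above this would print nvidia and nouveau. The nouveau entry is expected when the proprietary driver is installed; the nvidia entry is the suspicious one.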
Blacklisted? Blacklisting is modprobe's job, so let's go check modprobe:
$ ls /lib/modprobe.d/
aliases.conf
blacklist_linux_4.15.0-137-generic.conf
blacklist_linux_4.15.0-141-generic.conf
blacklist-nvidia.conf
fbdev-blacklist.conf
nvidia-graphics-drivers.conf
systemd.conf
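See the blacklist-nvidia.conf file? Caught red-handed: it really was modprobe's doing. Was it really that easy? In reality, the first place I checked was /etc/modprobe.d, where I found nothing suspicious and let modprobe off the hook. Only after a long and fruitless search did I find its other lair, /lib/modprobe.d. Since when does this fellow keep more than one burrow?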
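To avoid missing a directory the way I did, the search can be scripted across every place modprobe reads configuration from. The following is a sketch under assumptions: find_blacklist_files is a hypothetical helper, and the default directory list (/etc/modprobe.d, /run/modprobe.d, /lib/modprobe.d) is what I understand modprobe.d(5) to document for this Ubuntu release; newer systems may have more.

```shell
#!/bin/sh
# Print every *.conf file that contains a "blacklist <module>" line.
# $1: module name exactly as written in the config, e.g. nvidia or nvidia-drm
# $2..: directories to search; assumed defaults are used when none are given
find_blacklist_files() {
    mod=${1:?module name required}
    shift
    if [ "$#" -eq 0 ]; then
        set -- /etc/modprobe.d /run/modprobe.d /lib/modprobe.d
    fi
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -l prints only the names of matching files; no match is not an error here.
        grep -l -E "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*\$" \
            "$dir"/*.conf 2>/dev/null || true
    done
}

# Usage:
#   find_blacklist_files nvidia
```

modprobe --showconfig | grep nvidia gives a similar view from modprobe's side: it dumps the merged configuration, although it does not say which file each line came from.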
Having gone to so much trouble to find the config file that blacklists the NVIDIA GPU, I have to drag it out and put it on display:
cat /lib/modprobe.d/blacklist-nvidia.conf
# Do not modify
# This file was generated by nvidia-prime
blacklist nvidia
blacklist nvidia-drm
blacklist nvidia-modeset
alias nvidia off
alias nvidia-drm off
alias nvidia-modeset off
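Judging by the comment, this file was generated by nvidia-prime. It even signed its work; at least it owns up to what it did.
Delete it:
sudo rm /lib/modprobe.d/blacklist-nvidia.conf
Note that only blacklist-nvidia.conf needs to go. Do not delete nvidia-graphics-drivers.conf as well, even though both file names contain nvidia.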
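Since the file header says "Do not modify", a more cautious variant is to move the file out of the way instead of deleting it, so it can be restored later. This is a sketch, not what I actually ran: disable_modprobe_conf and the backup directory are hypothetical names. It works because modprobe only reads files ending in .conf from its configuration directories, so a copy stored elsewhere is ignored.

```shell
#!/bin/sh
# Move a modprobe config file into a backup directory so modprobe no longer sees it.
# $1: path of the .conf file to disable
# $2: backup directory (hypothetical location; it is created if missing)
disable_modprobe_conf() {
    conf=${1:?config file required}
    backup_dir=${2:?backup directory required}
    if [ ! -f "$conf" ]; then
        echo "no such file: $conf" >&2
        return 1
    fi
    mkdir -p "$backup_dir" && mv "$conf" "$backup_dir/$(basename "$conf").bak"
}

# Usage (as root):
#   disable_modprobe_conf /lib/modprobe.d/blacklist-nvidia.conf /root/modprobe-backup
```

nvidia-prime also ships the prime-select tool, and sudo prime-select nvidia is the intended way to switch back to the NVIDIA card. I did not try it here, so whether it cleans up this file on every version is an assumption worth verifying.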
To be safe, also regenerate the initramfs so it does not carry a stale copy of the blacklist:
sudo update-initramfs -u
Reboot.
This time I could finally summon NVIDIA, which has shot to fame along with machine learning over the past few years:
tianlang@tianlang:spark$ nvidia-smi
Sat Mar 27 07:27:19 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P0    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
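The table is nice to look at but awkward to use in a script. nvidia-smi can also print selected fields as CSV via --query-gpu and --format, which makes a post-reboot check easy to automate. A small sketch follows; the helper name driver_version_from_csv is my own, and the sample line in the comment is assembled from the values in the table above rather than captured from a real run.

```shell
#!/bin/sh
# Read "name, driver_version, memory.total" CSV lines on stdin and
# print the driver version (second field) of each GPU.
driver_version_from_csv() {
    awk -F', ' '{ print $2 }'
}

# Usage on a machine with a working driver:
#   nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader \
#       | driver_version_from_csv
# For the GPU in this article the CSV line should look roughly like:
#   GeForce GTX 950M, 460.67, 2004 MiB
```

If the command exits with an error instead of printing a line, the driver is still not loaded and the earlier checks are worth repeating.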