I (Aki) have finally saved up some money. After mulling it over for a long time, I can finally make some small moves with AI drawing.
In keeping with the principle that one shouldn't talk big and shouldn't spend money recklessly, I considered buying a "high-performance" (read: outdated) graphics card to play with Stable Diffusion (hereinafter referred to as SD).
The card was originally intended for my server, but in the end it wouldn't fit, which was rather awkward.
My PC runs Ubuntu 22.04 and its original graphics card was a GTX 1660 Ti, so that slot could be freed up for the new card instead.
| Core Component | Before Replacement | After Replacement |
| --- | --- | --- |
| CPU | AMD Ryzen 5 3500X | Unchanged |
| Motherboard | MSI B450M | Unchanged |
| Memory | 40 GB | Unchanged |
| Graphics Card 1 | NVIDIA GTX 1660 Ti | Tesla P40 |
| Graphics Card 2 | None | NVIDIA GeForce 310 |
The GeForce 310 serves only as a display card. Before the swap, the rated power of these components (not counting overclocking) was about 215 W; after the swap it rose to 335 W (making me a partner, and unofficial employee, of the State Grid), so the electricity bill has gone up noticeably.
Why Choose Tesla P40#
First, let's take a look at the technical specifications of this card.
| Specification | Value |
| --- | --- |
| GPU Architecture | NVIDIA Pascal™ |
| Single-Precision Floating-Point Performance | 12 TeraFLOPS* |
| Integer Performance (INT8) | 47 TOPS* (trillions of operations/second) |
| GPU Memory | 24 GB |
| Memory Bandwidth | 346 GB/s |
| System Interface | PCI Express 3.0 x16 |
| Dimensions | 4.4" (H) x 10.5" (L), dual slot, full height |
| Maximum Power Consumption | 250 W |
| Enhanced Programmability with Page Migration Engine | Yes |
| ECC Protection | Yes |
| Hardware-Accelerated Video Engine | 1 decode engine, 2 encode engines |
The P40's real-world performance is roughly equivalent to a GTX 1080 Ti; netizens have made detailed comparisons (see the Zhihu link in the references).
That said, why must I choose the P40?
Aside from the technical specifications like single-precision floating-point and integer performance, which I don't really understand, what I value most is its 24 GB of memory. That Zhihu answer is from 2018, and the P40 was first released in 2016; back then, a professional compute card like this was inevitably expensive. Times have changed: comparing prices on the second-hand market (a certain fish platform), the 1080 Ti is now actually the slightly more expensive one, a result of the wave of AI model releases in the first half of this year driving up second-hand prices for cards usable for training.
The P40's 24 GB of memory, versus the 1080 Ti's 11 GB, lets it run more kinds of AI models.
Additionally, the server itself has integrated graphics, so display output would not have been a concern, which is why I chose the P40 (950 RMB).
Some Operations Before and After Installing P40#
The correct way to install an NVIDIA driver on Ubuntu is to first disable the open-source nouveau driver and switch to pure command-line mode (init 3).
Since I had already done these steps when installing the driver for the 1660 Ti, this article skips them.
If you are following this article to install drivers on Linux, be sure to do that preparation first, as sketched below.
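For anyone who hasn't done that preparation, it looks roughly like this (a sketch from memory rather than the exact commands I ran back then; adjust to your own system):
~$ vim /etc/modprobe.d/blacklist-nouveau.conf
# Add the following two lines to blacklist the open-source driver
blacklist nouveau
options nouveau modeset=0
~$ update-initramfs -u
~$ reboot
# After rebooting, switch to pure command-line mode (the systemd equivalent of init 3)
~$ systemctl isolate multi-user.target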
Since this PC originally ran the 1660 Ti, there would be no display output after swapping in the P40, so I needed a separate display card. I hurried to a nearby computer shop and paid a "premium" for a GeForce 310 to use purely as a display card (otherwise the machine would not light up or pass POST).
On Ubuntu I did not physically install the P40 right away; I put in only the GF310 first, then uninstalled the original driver, updated the sources, and installed the new driver.
~$ lspci | grep -i nvidia
25:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 310] (rev a2)
~$ apt remove --purge nvidia-*
~$ nvidia-smi
The command "nvidia-smi" could not be found, but it can be installed through the following package:
~$ apt autoremove
`lspci` checks which graphics cards are currently present on the PCI bus. `apt remove --purge nvidia-*` uninstalls the original 1660 Ti driver. If `nvidia-smi` then reports that the command cannot be found, the driver has indeed been removed. `apt autoremove` makes sure any leftover dependency packages are cleaned up as well.
The GF310 needs the legacy version 340 driver, but the `ubuntu-drivers devices` command could not find a suitable version, and installing the driver file downloaded from the official website also failed (this driver really is ancient).
The reason is that the default package sources do not include the 340 driver, so they need to be supplemented with a PPA.
~$ add-apt-repository ppa:graphics-drivers/ppa
~$ apt-get update
~$ apt-get install nvidia-340
Even after installing and rebooting, I still could not use the nvidia-smi command to check the graphics card driver.
~$ apt-get install nvidia-340
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
nvidia-340 is already the newest version (340.108-0ubuntu8).
0 upgraded, 0 newly installed, 0 to remove, and 2 not upgraded.
However, re-running `apt-get install nvidia-340` reports that it is already installed, so the display card's driver should be in place; the screen resolution also adjusted itself automatically, which resolved the display issue.
Before installing the P40, you must enter the BIOS, enable Above 4G Decoding, and disable Secure Boot.
After shutting down again and installing the P40, I powered the machine back on.
This time the P40 showed up on the PCI bus.
~$ lspci | grep -i nvidia
25:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 310] (rev a2)
25:00.1 Audio device: NVIDIA Corporation High Definition Audio Controller (rev a1)
26:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
Since I updated the ppa source, the next step of installing the driver became much simpler.
~$ ubuntu-drivers autoinstall
After installation, rebooting again allows me to see the graphics card information.
~$ nvidia-smi
Wed Oct 25 16:55:35 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06 Driver Version: 545.23.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P40 Off | 00000000:26:00.0 Off | Off |
| N/A 41C P8 12W / 250W | 0MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
In the output you can see that the P40 shows Off under the Persistence-M column; persistence mode is off by default. Enabling it lets the GPU respond to tasks more quickly, at the cost of higher standby power consumption.
To let the card respond to compute work quickly, I enabled it.
~$ nvidia-smi -pm 1
Enabled persistence mode for GPU 00000000:26:00.0.
All done.
~$ nvidia-smi
Wed Oct 25 16:55:35 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06 Driver Version: 545.23.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P40 On | 00000000:26:00.0 Off | Off |
| N/A 41C P8 12W / 250W | 0MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
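A caveat (an assumption on my part, not something I tested on this machine): the setting made with nvidia-smi -pm does not survive a reboot. The driver also ships an nvidia-persistenced daemon that is meant to keep persistence mode on across reboots, which can usually be enabled like this:
~$ systemctl enable --now nvidia-persistenced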
The header above also shows a CUDA Version; this is the CUDA version the driver supports, not an indication that CUDA is already installed. To confirm whether it is installed, just check the /usr/local directory for CUDA-related files; if there are none, CUDA is not installed.
Install CUDA#
~$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
~$ dpkg -i cuda-keyring_1.1-1_all.deb
~$ apt-get update
~$ apt-get -y install cuda
After installation, you can see a similar prompt.
*****************************************************************************
*** Reboot your computer and verify that the NVIDIA graphics driver can ***
*** be loaded. ***
*****************************************************************************
~$ cd /usr/local
~$ ls
bin cuda cuda-12 cuda-12.3 etc games include lib man sbin share src
This means CUDA can be used after a reboot, and switching to /usr/local shows the CUDA directories.
No rush, though; let's finish the related configuration before rebooting.
~$ vim ~/.bashrc
# Add the following content, pay attention to the version number
export PATH=/usr/local/cuda-12.3/bin${PATH:+:${PATH}}
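The CUDA install guide also recommends exporting the library path, so I added that too; this follows the official documentation rather than anything specific to this machine, with the path matching the cuda-12.3 installed above. Afterwards, nvcc --version is a quick way to confirm the toolkit is visible.
~$ vim ~/.bashrc
# Also add the library path, again matching the installed version number
export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
~$ source ~/.bashrc
~$ nvcc --version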
After completing this, reboot.
As mentioned earlier, the GF310 cannot be displayed in nvidia-smi, presumably because its legacy driver is incompatible with the current nvidia-smi, although the card does light up and drive the display.
Since I cannot tell from there whether its driver is working properly, and there is no related hardware information under /dev, it remains a bit of a loose end. As a next step I am considering replacing it with a GTX 750 Ti, or using an external graphics dock with a newer card that nvidia-smi can recognize, as the display card.
That's a topic for another time.
Deploy Stable Diffusion#
git clone https://github.com/AbdBarho/stable-diffusion-webui-docker.git
cd stable-diffusion-webui-docker
# Install Docker graphics dependencies
apt install nvidia-docker2 nvidia-container-toolkit nvidia-container-runtime
# Install dependency packages, which will automatically download the Stable Diffusion v1.5 model.
docker compose --profile download up --build
# Start the container, select auto to start the WebUI developed by AUTOMATIC1111
docker compose --profile auto up --build
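Before relying on the WebUI, it is worth confirming that containers can actually see the P40. The check below is a generic sanity test rather than part of the project's instructions; the CUDA image tag is only an example, and the nvidia-ctk step is needed only if Docker has not yet been configured to use the NVIDIA runtime.
# Register the NVIDIA runtime with Docker if it is not configured yet
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker
# A container that can see the GPU should print the same nvidia-smi table as the host
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi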
After starting, you can access the SD web page via http://ip:7860, which by default only has the Stable Diffusion v1.5 model.
Try drawing.
Prompt (sourced from the internet):
beautiful render of a Tudor style house near the water at sunset, fantasy forest. photorealistic, cinematic composition, cinematic high detail, ultra realistic, cinematic lighting, Depth of Field, hyper-detailed, beautifully color-coded, 8k, many details, chiaroscuro lighting, ++dreamlike, vignette
Generating an image actually takes only about 3-5 seconds, so this card clearly still holds up.
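Besides clicking around in the web page, generation can also be scripted against the AUTOMATIC1111 HTTP API, provided the --api flag is included in the webui's CLI_ARGS (I have not checked whether this compose setup enables it by default). A rough example:
# Returns a JSON response with the generated image encoded in base64
curl -s http://ip:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a Tudor style house near the water at sunset", "steps": 20}'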
During the image generation process, changes in the graphics card can be observed.
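A simple way to watch this live is to poll nvidia-smi from another terminal, for example:
~$ watch -n 1 nvidia-smi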
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06 Driver Version: 545.23.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P40 On | 00000000:26:00.0 Off | Off |
| N/A 50C P0 54W / 250W | 210MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 32634 C python 208MiB |
+---------------------------------------------------------------------------------------+
All in all, pretty effortless (with the default parameters, of course).
References#
[1] Tesla P40 Technical Specifications
[2] Tesla P40 Release Information
[3] Comparison of Tesla P40 and GTX 1080TI