NVIDIA Maxwell & Kepler on Arch Linux
My workstation project now has a GeForce GTX Titan X and a Tesla K80 installed on an old Z640 v1 board. The issue at this point is not with the Titan card, which carries a Maxwell-architecture GPU, but rather with the Kepler-class chips in the K80.
It seems that the latest Arch Linux nvidia driver, which at this point is 535.104.05, works fine with the Titan, but driver support for data-center type GPU accelerator cards like the K80 topped out at 460.106.00:
NVIDIA driver compatibility page: https://www.nvidia.com/Download/index.aspx?lang=en-us

Data Center Driver for Linux x64
Version: 460.106.00
Release Date: 2021.10.26
Operating System: Linux 64-bit
CUDA Toolkit: 11.2
Language: English (US)
File Size: 171.61 MB
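Before digging into archives, it helps to confirm what the current driver actually sees on this board. A quick sanity check, assuming the stock nvidia package from the Arch repos is what is installed:

```
# Show which nvidia driver version the Arch package manager has installed
pacman -Qi nvidia | grep -E '^(Name|Version)'

# List every NVIDIA device on the PCI bus and the kernel driver bound to it
# (10de is NVIDIA's PCI vendor ID; the K80 shows up as two 3D controllers)
lspci -nnk -d 10de:

# Ask the driver itself which GPUs it is willing to manage
nvidia-smi -L
```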
Looking back into archive.archlinux.org, there are several versions of 460.67.
Instead of playing around with the bootloader on the root partition, perhaps it is better to install a data-development partition specifically set up for compatibility with older parts and systems, using linux-lts or another more suitable kernel. This partition will have its own grub_uefi bootloader tailored to the requirements of setting up such a development environment on this and related hardware.
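As a rough sketch of that separate install (the /mnt/devroot mount point and the --bootloader-id label are just placeholders for this example), the sequence from the existing Arch system could look something like this:

```
# Sketch only: assumes the new data-development root is mounted at /mnt/devroot
# and its EFI system partition at /mnt/devroot/boot.
# pacstrap/genfstab/arch-chroot come from the arch-install-scripts package.
pacstrap /mnt/devroot base linux-lts linux-lts-headers grub efibootmgr
genfstab -U /mnt/devroot >> /mnt/devroot/etc/fstab

# Give the partition its own grub_uefi entry, separate from the main install
arch-chroot /mnt/devroot grub-install --target=x86_64-efi \
    --efi-directory=/boot --bootloader-id=ARCH-DEV
arch-chroot /mnt/devroot grub-mkconfig -o /boot/grub/grub.cfg
```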
Here is a related post involving downgrading nvidia drivers for compatibility:
https://bbs.archlinux.org/viewtopic.php?id=265445
This post demonstrates pointing the mirrorlist Server entry at a specific date in the archives, then simply telling the package manager to perform a full synchronization.
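In short, the mirrorlist is pointed at a dated snapshot of archive.archlinux.org and pacman is then allowed to downgrade everything to match it. A minimal sketch of the method (the date shown is the snapshot chosen later in this post; any archive date works the same way):

```
# Replace the mirrorlist with a single archive snapshot
# (single quotes keep $repo and $arch literal, as pacman expects)
echo 'Server = https://archive.archlinux.org/repos/2021/04/14/$repo/os/$arch' \
    | sudo tee /etc/pacman.d/mirrorlist

# Force-refresh all databases and synchronize, allowing downgrades (-uu)
sudo pacman -Syyuu
```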
The linux-lts kernel is being used, hopefully to head off further complications, given how transient the ‘legacy’ window of compatibility is on modern distros. After researching some of the information available on the driver software provided by NVIDIA, it has been noted that they are now providing a semi-open-source version to Linux developers; in Arch, this package is nvidia-open. I hope some good things come from this in the future.
The logic of the archive permits pinning to a directed date and time of historical compatibility, as an edited mirrorlist directs. Going by the highest nvidia 460xx version available on the archive, the appropriate snapshot is nvidia-460.67-9-x86_64.pkg.tar.zst 14-Apr-2021 12:03 23M.
Thus, this date will be the setting used with the lts kernel for a first-pass experiment on the Kepler & Maxwell data system.
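Once that snapshot is installed, the driver needs to stay put while the rest of the system eventually moves back to current mirrors. One way to hold it is pacman's IgnorePkg; note that with linux-lts as the kernel, the matching driver package would be nvidia-lts (or nvidia-dkms built against the lts headers) rather than plain nvidia, so adjust the list to whatever actually ended up installed:

```
# /etc/pacman.conf — keep the downgraded driver pinned at the 2021/04/14 snapshot.
# Package names here assume the lts pairing; edit to match the installed set.
IgnorePkg = nvidia-lts nvidia-utils nvidia-settings
```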
Finding the appropriate method from several parallel source package configurations requires some graph traversal. 9-14-23
A message during PCI bus initialization indicates that Arch Linux's nvidia-470xx packages are sufficient for compatibility with the data-center card. I have tried this previously; however, a failure to reach the plymouth boot target discouraged that first attempt.
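For reference, the 470xx legacy branch lives in the AUR; at the time of writing the relevant packages appear to be nvidia-470xx-utils and nvidia-470xx-dkms (the dkms variant rebuilds the module against whatever kernel headers are present, which suits linux-lts). A sketch of the usual manual AUR build:

```
# Build and install the legacy 470xx driver from the AUR.
# linux-lts-headers must be installed for dkms to build the module.
git clone https://aur.archlinux.org/nvidia-470xx-utils.git
git clone https://aur.archlinux.org/nvidia-470xx-dkms.git
( cd nvidia-470xx-utils && makepkg -si )
( cd nvidia-470xx-dkms && makepkg -si )
```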
Here is a comparison of the specs for both NVIDIA products, starting with the Tesla K80:
- 560 MHz Core – Boostable to 875 MHz
- 4992 CUDA Cores
- 24GB GDDR5 vRAM
- 10 GHz Effective Memory Clock
- 384-Bit Memory Interface
- Kepler Architecture
- 5.6 TFLOPS Single Precision Processing
- Passive Heatsink Cooling
- PCI Express 3.0 x16 Interface
NVIDIA K80 Overview
Built on two Kepler GPUs, the NVIDIA Tesla K80 GPU Accelerator is intended for use in servers and supercomputers. The K80 is a dual-GPU unit which utilizes two GK210B chips. As a unit, this card offers a total of 4992 CUDA cores clocked at 560 MHz, coupled to 24GB of GDDR5 vRAM with a 384-bit memory interface and 480 GB/s of bandwidth. This card features no display outputs; it is designed to serve as a hardware accelerator using the CUDA cores in the GPU. Its CUDA cores are arranged using NVIDIA's Kepler architecture, and with compatible APIs, software can leverage its massively parallel processing abilities. Projects that can take advantage of this parallel processing include computational fluid dynamics, structural mechanics, numerical analytics, and molecular dynamics.
While the GPU by default runs at 560 MHz, it can be boosted up to 875 MHz to greatly increase processing power. This boost is initiated when the card is under load and there is enough temperature overhead to allow the card to use more power. Because this card is built to be used in servers, it features passive heatsink cooling which relies on the airflow and cooling present in a server structure. The passive heatsink is silent and increases reliability because it contains no moving parts.
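Assuming a driver that recognizes the Kepler parts ends up installed, a few nvidia-smi calls are enough to confirm the dual-GPU layout and keep an eye on the passive card's temperatures (device numbering will depend on this particular machine):

```
# The K80 should enumerate as two separate GPUs alongside the Titan X
nvidia-smi -L

# Keep the driver loaded between jobs on the headless accelerator
sudo nvidia-smi -pm 1

# Watch temperatures every 5 seconds — the K80 is passively cooled
# and depends entirely on case airflow
nvidia-smi --query-gpu=index,name,temperature.gpu --format=csv -l 5
```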
GTX TITAN X Engine Specs:
3072 CUDA Cores
1000 Base Clock (MHz)
1075 Boost Clock (MHz)
192 Texture Fill Rate (GigaTexels/sec)
GTX TITAN X Memory Specs:
7.0 Gbps Memory Clock
12 GB Standard Memory Config
GDDR5 Memory Interface
384-bit Memory Interface Width
336.5 Memory Bandwidth (GB/sec)
GTX TITAN X Technology Support:
NVIDIA SLI® Ready: Yes (4-way)
NVIDIA G-Sync™ Ready: Yes
NVIDIA GameStream™ Ready: Yes
GeForce ShadowPlay™: Yes
NVIDIA GPU Boost™: 2.0
Dynamic Super Resolution: Yes
MFAA: Yes
NVIDIA GameWorks™: Yes
Microsoft DirectX: 12 API with Feature Level 12.1
OpenGL: 4.5
CUDA: Yes
Bus Support: PCI Express 3.0
OS Certification: Windows 8 & 8.1, Windows 7, Windows Vista, Linux, FreeBSD x86
Display Support:
Maximum Digital Resolution*: 5120x3200
Maximum VGA Resolution: 2048x1536
Standard Display Connectors: Dual Link DVI-I, HDMI 2.0, 3x DisplayPort 1.2
Multi Monitor: 4 displays
HDCP: Yes
Audio Input for HDMI: Internal
GTX TITAN X Graphics Card Dimensions:
4.376 inches Height
10.5 inches Length
Dual-slot Width
Thermal and Power Specs:
91 C Maximum GPU Temperature (in C)
250 W Graphics Card Power (W)
600 W Recommended System Power (W)**
6-pin + 8-pin Supplementary Power Connectors
Maxwell Architecture – GM200 graphics core
NVIDIA GPU Specification Comparison

| | GTX TITAN X (Pascal) | GTX 1080 | GTX 1070 | GTX TITAN X (Maxwell) |
|---|---|---|---|---|
| CUDA Cores | 3584 | 2560 | 1920 | 3072 |
| Core Clock | 1417 MHz | 1607 MHz | 1506 MHz | 1000 MHz |
| Boost Clock | 1531 MHz | 1733 MHz | 1683 MHz | 1075 MHz |
| TFLOPs (FMA) | 11 TFLOPs | 9 TFLOPs | 6.5 TFLOPs | 6.6 TFLOPs |
| Memory Clock | 10 Gbps GDDR5X | 10 Gbps GDDR5X | 8 Gbps GDDR5 | 7 Gbps GDDR5 |
| Memory Bus Width | 384-bit | 256-bit | 256-bit | 384-bit |
| VRAM | 12 GB GDDR5X | 8 GB GDDR5X | 8 GB GDDR5 | 12 GB GDDR5 |
| TDP | 250 W | 180 W | 150 W | 250 W |
| GPU | GP102 | GP104 | GP104 | GM200 |
| Transistor Count | 12B | 7.2B | 7.2B | 8B |
Considering the specific driver requirements of the old data-center cards, it seems more logical to consider a dual Titan X configuration. Let's look at some basic spec comparisons and see how one K80 compares to two Titan X's.
| | K80 | Zotac GeForce GTX Titan X (Maxwell) |
|---|---|---|
| Memory Clock | 10 GHz | 7 GHz |
| CUDA Cores | 4992 | 3072 x 2 = 6144 |
| VRAM | 24 GB GDDR5 | 12 GB GDDR5 x 2 = 24 GB GDDR5 |
| Base Clock | 560 MHz | 1000 MHz |
| TFLOPs | 5.6 | 6.6 |
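Either way, CUDA work can be steered to whichever card makes sense through the CUDA_VISIBLE_DEVICES environment variable; the indices below are just an assumption of how this box enumerates (check nvidia-smi -L), and my_cuda_app is a stand-in for any CUDA workload:

```
# Run a job on the Titan X only (index 0 assumed here)
CUDA_VISIBLE_DEVICES=0 ./my_cuda_app

# Run the same job across both halves of the K80 (indices 1 and 2 assumed)
CUDA_VISIBLE_DEVICES=1,2 ./my_cuda_app
```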
It makes more sense to design a dedicated server around several such cards, left unmodified since active cooling will be present; that would also solve the kernel panics and system modifications required when starting from a base Arch install, etc.
Cheers. Excellent stuff!