[Linux] How to Resolve Graid Driver Failure After NVIDIA Upgrade

Environment

  • RAID Model: SupremeRAID™ SR1000 / SR1010 / SR1001

  • Host Hardware: all servers (x86, Intel/AMD platforms)

  • Operating System: Linux

  • SupremeRAID™ Version: all

  • NVIDIA Driver (before upgrade): 570.124.04 (CUDA 12.8) [default version]

  • NVIDIA Driver (after upgrade): 580.65.06 (CUDA 13.0), or another version

Issue

After upgrading the NVIDIA driver from 570.124.04 to 580.65.06, the graid service failed to start.

Logs showed:


modprobe: ERROR: could not insert 'graid_nvidia': Invalid argument
graid.service: Failed with result 'exit-code'.

Resolution

  1. Cause:

    • The NVIDIA driver upgrade changed the versions of the kernel symbols that the driver exports.

    • The graid-nvidia kernel module (compiled against driver 570) was not recompiled for the new driver (580).

    • As a result, the module could not resolve symbols and failed to load, preventing the graid service from starting.
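
    To confirm this cause on an affected host, the kernel log can be checked for symbol errors raised while loading the module. The following is a minimal sketch and not part of the SupremeRAID tooling: filter_symbol_errors and nvidia_driver_version are hypothetical helper names, and the two message patterns are the generic Linux module-loader wordings for an unresolved or version-mismatched symbol, which are assumed to be what the failed load produces here.

```shell
# Hypothetical helper (not shipped with SupremeRAID): reads kernel-log text
# on stdin and prints only the lines where graid_nvidia hit a symbol problem.
filter_symbol_errors() {
    grep -E 'graid_nvidia.*(Unknown symbol|disagrees about version of symbol)' || true
}

# Hypothetical helper: prints the version of the nvidia module installed for
# the running kernel, or "unknown" if modinfo cannot find it.
nvidia_driver_version() {
    modinfo -F version nvidia 2>/dev/null || echo unknown
}
```

    For example, running dmesg | filter_symbol_errors lists any matching lines from the current boot, and nvidia_driver_version should report 580.65.06 after the upgrade described above.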

  2. Fix Applied:
    Update /usr/bin/graid_server_pre.sh to include an auto-rebuild mechanism before loading the module:


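# Try to load the module; rebuild it through DKMS only if loading fails.
if ! modprobe graid-nvidia 2>/dev/null; then
    # Collect every graid version registered with DKMS.
    versions=$(dkms status graid | grep -oP 'graid/\K[^,]+' | sort -u)
    for version in $versions; do
        # Remove the stale build, then rebuild against the current NVIDIA driver.
        dkms remove "graid/$version" --all
        dkms install "graid/$version"
    done
    # Retry the load with the freshly built module.
    modprobe graid-nvidia
fi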
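
    The rebuild loop depends on the version list parsed from dkms status. As a rough illustration of what that pipeline yields, the sketch below runs the same extraction on sample text. It assumes the module/version, kernel, arch: state layout; graid_versions is a hypothetical helper name, and the version and kernel strings are made up.

```shell
# Same extraction as in the fix, wrapped so it can be tried on sample text.
# graid_versions is a hypothetical helper name; -P requires GNU grep.
graid_versions() {
    grep -oP 'graid/\K[^,]+' | sort -u
}

# Sample input: the version and kernel strings are invented for illustration.
printf '%s\n' \
    'graid/1.2.3, 5.15.0-91-generic, x86_64: installed' \
    'graid/1.2.3, 6.8.0-40-generic, x86_64: installed' \
    | graid_versions
# prints: 1.2.3
```

    If the installed dkms prints the module and version separated by a comma rather than a slash (as some older releases do), the pattern matches nothing and the loop is skipped, so it is worth checking the output of dkms status graid once by hand.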

  3. Verification:

    • Restart graid and confirm service status:


systemctl restart graid

systemctl status graid

    • Confirm module load success:

lsmod | grep graid
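
    The two checks above can also be combined into one scripted test, for example for use after each driver upgrade. This is a minimal sketch under stated assumptions: has_graid_module and verify_graid are hypothetical helper names, the service is assumed to be named graid as in the commands above, and the module is matched by the graid_nvidia name shown in the error log.

```shell
# Hypothetical helper: succeeds if lsmod-style text on stdin has a line whose
# first column is exactly graid_nvidia.
has_graid_module() {
    awk '$1 == "graid_nvidia" { found = 1 } END { exit !found }'
}

# Hypothetical helper: checks the service state first, then the module.
verify_graid() {
    if ! systemctl is-active --quiet graid; then
        echo "graid service is not active" >&2
        return 1
    fi
    if ! lsmod | has_graid_module; then
        echo "graid_nvidia module is not loaded" >&2
        return 1
    fi
    echo "graid service is active and graid_nvidia is loaded"
}
```

    Calling verify_graid returns a non-zero status on either failure, so it can be chained in a post-upgrade script.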

  4. Preventive Measure:

    • Keep the auto-rebuild logic in place for all future NVIDIA driver upgrades.

    • Add monitoring/alerting for graid-nvidia load failures.
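
    One possible shape for such monitoring is a periodic check of the service journal for the modprobe error shown in the Issue section. The sketch below is only an illustration: count_load_failures and check_graid_load are hypothetical helper names, the graid-monitor log tag is made up, and it assumes the modprobe error is recorded in the journal of the graid unit.

```shell
# Hypothetical helper: counts graid_nvidia insert failures in log text on stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_load_failures() {
    grep -c "could not insert 'graid_nvidia'" || true
}

# Hypothetical helper: checks the current boot's journal for the graid unit
# and writes an error-level syslog entry if any load failure is found.
check_graid_load() {
    failures=$(journalctl -b -u graid --no-pager 2>/dev/null | count_load_failures)
    if [ "$failures" -gt 0 ]; then
        logger -p daemon.err -t graid-monitor \
            "graid_nvidia failed to load $failures time(s) this boot"
        return 1
    fi
}
```

    Because the first modprobe in the fix discards its error output, only a failure that persists after the rebuild would be logged and counted, which is the case that needs attention. check_graid_load could be run from cron or a systemd timer, with the alerting system watching for the graid-monitor tag.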

