[Linux] How to Resolve Graid Driver Failure After NVIDIA Upgrade

Environment

  • RAID Model: SupremeRAID™ SR1000 / SR1010 / SR1001

  • Host Hardware: All servers (x86, Intel/AMD platforms)

  • Operating System: Linux

  • SupremeRAID™ Version: all

  • NVIDIA Driver: 570.124.04 (CUDA 12.8) [default version]

  • NVIDIA Driver: 580.65.06 (CUDA 13.0), or another version

Issue

After upgrading the NVIDIA driver from 570.124.04 to 580.65.06, the graid service failed to start.

Logs showed:


modprobe: ERROR: could not insert 'graid_nvidia': Invalid argument
graid.service: Failed with result 'exit-code'.

Resolution

  1. Cause:

    • The NVIDIA driver upgrade changed the symbol versions exported by the NVIDIA kernel modules.

    • The graid-nvidia kernel module (shown as graid_nvidia in the logs) was compiled against driver 570 and was not recompiled for the new driver (580).

    • As a result, the module could not resolve its symbols and failed to load, which prevented the graid service from starting.
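One way to confirm this cause on an affected host is to look for symbol-resolution errors in the kernel log. The sketch below is an assumption about how such a check could look and is not part of the vendor procedure: `find_symbol_errors` is a hypothetical helper name, and the two patterns are the generic messages the Linux module loader prints when a module's symbols do not match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter kernel log lines (stdin) for symbol-resolution
# failures reported against the graid NVIDIA module.
find_symbol_errors() {
    grep -E 'graid[_-]nvidia: (disagrees about version of symbol|Unknown symbol)'
}

# Usage on a live host (reading the kernel log may require root):
#   dmesg | find_symbol_errors
#   cat /sys/module/nvidia/version   # version of the currently loaded NVIDIA driver
```

If the filter prints matching lines right after a driver upgrade, a stale graid-nvidia build is the likely explanation; no output suggests looking elsewhere.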

  2. Fix Applied:
    Update /usr/bin/graid_server_pre.sh to include an auto-rebuild mechanism before loading the module:


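# Try to load the module; on failure, rebuild it against the installed NVIDIA driver.
if ! modprobe graid-nvidia 2>/dev/null; then
    # Collect every graid version registered with DKMS.
    versions=$(dkms status graid | grep -oP 'graid/\K[^,]+' | sort -u)
    for version in $versions; do
        # Remove the stale build, then rebuild and install it.
        dkms remove "graid/$version" --all
        dkms install "graid/$version"
    done
    # Retry the load with the freshly built module.
    modprobe graid-nvidia
fi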
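The `grep -oP 'graid/\K[^,]+'` pattern expects `dkms status` to print entries in the `module/version` form. Older DKMS releases print `module, version, ...` instead, in which case the loop would find nothing to rebuild. A minimal sketch of a parser that tolerates both layouts follows; `parse_graid_versions` is a hypothetical name, and the sample lines in the comments are illustrative rather than captured from a SupremeRAID host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `dkms status` output on stdin and print the
# unique graid versions, one per line. Handles both layouts:
#   graid/1.5.0, 5.15.0-91-generic, x86_64: installed   (newer DKMS)
#   graid, 1.5.0, 5.15.0-91-generic, x86_64: installed  (older DKMS)
parse_graid_versions() {
    sed -nE 's#^graid[/,] *([^,:]+)[,:].*#\1#p' | sort -u
}

# Possible use inside the rebuild script (left as a comment):
#   versions=$(dkms status graid | parse_graid_versions)
```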

  3. Verification:

    • Restart graid and confirm the service status:


systemctl restart graid
systemctl status graid

    • Confirm that the module loaded successfully:

lsmod | grep graid
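For unattended verification, the two manual checks can be folded into a script. The sketch below is one possible shape, assuming bash; `module_loaded` is a hypothetical helper that reads a `/proc/modules`-style listing, where the kernel always records module names with underscores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the named module appears in a
# /proc/modules-style listing (second argument, default /proc/modules).
module_loaded() {
    local name=${1//-/_}              # graid-nvidia is listed as graid_nvidia
    local listing=${2:-/proc/modules}
    grep -q "^${name} " "$listing"
}

# Possible use after restarting the service (left as a comment):
#   if systemctl is-active --quiet graid && module_loaded graid-nvidia; then
#       echo "graid is running and graid_nvidia is loaded"
#   else
#       echo "graid verification failed" >&2
#   fi
```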

  4. Preventive Measure:

    • Keep the auto-rebuild logic in place for all future NVIDIA driver upgrades.

    • Add monitoring/alerting for graid-nvidia load failures.
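As an illustration of the monitoring item, the sketch below scans log lines for the insert failure shown in the Issue section and returns a Nagios-style exit code. It is an assumption about how such a check might be wired up, not a SupremeRAID feature: `check_graid_load_failures` is a hypothetical name, and the example presumes the modprobe message is recorded in the journal of the graid unit.

```shell
#!/usr/bin/env bash
# Hypothetical alert check: read log lines on stdin and return 2 (CRITICAL)
# if any graid-nvidia insert failure is present, otherwise 0 (OK).
check_graid_load_failures() {
    local hits
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    hits=$(grep -cE "could not insert 'graid[_-]nvidia'" || true)
    if [ "$hits" -gt 0 ]; then
        echo "CRITICAL: $hits graid-nvidia load failure(s) logged"
        return 2
    fi
    echo "OK: no graid-nvidia load failures logged"
    return 0
}

# Possible use from cron or a systemd timer (left as a comment):
#   journalctl -b -u graid --no-pager | check_graid_load_failures \
#       || logger -p daemon.err -t graid-check "graid-nvidia failed to load"
```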

