[Linux] Controller Shows "MISSING" After License Application

Environment

  • RAID Models: SupremeRAID SR-1000, SR-1010, SR-1001

  • Host Hardware: Any

  • Operating System: Linux


Issue

On certain Linux systems, after applying the SupremeRAID license, the controller may be reported as "MISSING" in the output of graidctl or in graid_server logs.

You may observe log messages similar to:


[graid_server] [error] Controller0: Failed to create gdev, Failed to create gdev, status = 13
[graid_server] [error] Controller0: bind gpu error -13
[graid_core0] [error] GPUDevice: Failed to set current CUDA device, initialization error

This behavior typically occurs immediately after license activation and prevents normal RAID operation. The error message "There is no available controller" is shown.
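
To check log text for these signatures, a small helper along the following lines may be useful. It is only a sketch: the function name has_missing_signature is hypothetical, the patterns are copied from the sample messages above, and the log source (for example journalctl output or a log file) is an assumption that depends on your installation.

```shell
#!/bin/sh
# Sketch: detect the error signatures quoted above in log text read from stdin.
# has_missing_signature is a hypothetical helper name, not part of SupremeRAID.
has_missing_signature() {
    # -E: extended regular expression; -q: no output, exit status only.
    grep -Eq 'bind gpu error -13|Failed to create gdev|Failed to set current CUDA device'
}

# Example usage (the log source is an assumption; adjust to your system):
#   journalctl -b --no-pager | has_missing_signature && echo "signature found"
```

The function exits with status 0 when at least one signature is present and non-zero otherwise, so it can be used directly in a shell condition.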


Cause

This issue is related to a known compatibility problem with the NVIDIA driver, specifically the Heterogeneous Memory Management (HMM) feature. In affected systems, the NVIDIA UVM driver (nvidia-uvm) can interfere with device memory mapping required by SupremeRAID, leading to the controller being incorrectly marked as missing.

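NVIDIA has documented this issue, which it describes as an HMM initialization failure on certain Linux kernels with KASLR enabled, in the Known Issues section of its CUDA 12.8.1 release notes: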
👉 NVIDIA CUDA Toolkit Release Notes 12.8.1 – Known Issues



Resolution

Both of the following workarounds are effective. You may choose either option depending on your system requirements.

Option 1: Disable KASLR 

  1. Edit GRUB configuration:

    sudo vim /etc/default/grub
  2. Add nokaslr to the GRUB_CMDLINE_LINUX_DEFAULT line:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nokaslr"
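  3. Update GRUB and reboot (update-grub is the Debian/Ubuntu helper; on other distributions, regenerate the GRUB configuration with the equivalent tool, such as grub2-mkconfig):

    sudo update-grub
    sudo reboot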
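
After the reboot, it can be worth confirming that the kernel actually booted with the option. The sketch below reads the kernel command line from /proc/cmdline; the function name cmdline_has_nokaslr is hypothetical, and the optional file argument exists only so the check can be pointed at another file.

```shell
#!/bin/sh
# Sketch: report whether "nokaslr" is present on a kernel command line.
# cmdline_has_nokaslr is a hypothetical helper name.
cmdline_has_nokaslr() {
    file="${1:-/proc/cmdline}"
    # Put each space-separated token on its own line, then match the exact token.
    tr ' ' '\n' < "$file" | grep -qx 'nokaslr'
}

# Example usage after rebooting:
#   cmdline_has_nokaslr && echo "nokaslr is active for this boot"
```

If the check fails, the GRUB configuration was probably not regenerated, or the system boots from a different entry than the one that was edited.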

Option 2: Disable HMM for NVIDIA UVM Module

  1. Create or edit the module configuration file:

    sudo vim /etc/modprobe.d/nvidia-uvm.conf
  2. Add the following line:

    options nvidia-uvm uvm_disable_hmm=1
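  3. Regenerate initramfs (optional but recommended). The command below applies to dracut-based distributions; on Debian or Ubuntu, use sudo update-initramfs -u instead:

    sudo dracut -f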
  4. Reboot the system:

    sudo reboot
  5. After reboot, verify the parameter is applied:

    cat /sys/module/nvidia_uvm/parameters/uvm_disable_hmm # Should return: Y or 1
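
The verification in step 5 can also be scripted. The following is a minimal sketch, assuming the parameter file path shown above; the function name hmm_is_disabled and its exit codes are illustrative choices, not part of the NVIDIA driver or SupremeRAID tooling.

```shell
#!/bin/sh
# Sketch: interpret the uvm_disable_hmm module parameter.
# hmm_is_disabled is a hypothetical helper name.
# Exit status: 0 = HMM disabled, 1 = HMM not disabled, 2 = parameter file not readable.
hmm_is_disabled() {
    param="${1:-/sys/module/nvidia_uvm/parameters/uvm_disable_hmm}"
    # The file is absent when the nvidia_uvm module is not loaded.
    [ -r "$param" ] || return 2
    case "$(cat "$param")" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example usage after rebooting:
#   hmm_is_disabled && echo "HMM is disabled for nvidia_uvm"
```

A status of 2 usually means the module has not been loaded yet, in which case the check should be repeated once the GPU driver is up.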

Additional Notes

  • Disabling KASLR has minimal performance impact, but it removes a kernel exploit mitigation, so it is best suited to controlled environments. It is often more compatible across various NVIDIA driver versions.

  • Disabling HMM may affect CUDA workloads that depend on unified memory. Choose based on your GPU application requirements.