Environment
RAID Model: SR1000 / SR1001 / SR1010
Host Hardware: Multi-GPU system (e.g., NVIDIA A-series, with or without PCIe switch such as PLX/PEX)
Operating System: RHEL / Rocky / AlmaLinux, Ubuntu / Debian (UEFI boot)
Software: SupremeRAID Pre-installer (v1.7.x+)
After installing the SupremeRAID pre-installer on a multi-GPU system, the RAID card is not detected by nvidia-smi.
When checking kernel logs (dmesg), the following error messages appear:
This issue typically occurs during the boot sequence, before the NVIDIA or SupremeRAID drivers are fully initialized.
On systems with multiple GPUs or complex PCIe topologies, the Linux kernel may reallocate PCI resources during boot.
When PCI reallocation is enabled, certain devices may receive incorrect or zero-sized BAR assignments (e.g., BAR0 = 0M), causing the NVIDIA probe routine to fail.
Because the SupremeRAID controller relies on proper PCIe enumeration, this failure can also prevent the RAID card from being detected.
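One way to see what the kernel actually assigned is to read each device's sysfs resource file, whose first line describes BAR0 as start, end, and flags in hexadecimal. The sketch below is a heuristic only and is not part of any SupremeRAID or NVIDIA tooling. The helper name bar0_unassigned is hypothetical, and the rule it applies (an all-zero start address on BAR0 means the BAR was not assigned) is an assumption.

```shell
# bar0_unassigned RESOURCE_FILE
# Hypothetical helper: succeeds (exit 0) when the first line of a sysfs
# "resource" file, which describes BAR0, has an all-zero start address.
bar0_unassigned() {
    start=""
    read -r start _end _flags < "$1"
    [ -n "$start" ] || return 2          # empty or unreadable file
    case "${start#0x}" in
        *[!0]*) return 1 ;;              # a non-zero digit: BAR0 has an address
        *)      return 0 ;;              # all zeros: treat as unassigned
    esac
}

# Example use (NVIDIA devices report PCI vendor ID 0x10de):
# for dev in /sys/bus/pci/devices/*; do
#     [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
#     bar0_unassigned "$dev/resource" && echo "${dev##*/}: BAR0 looks unassigned"
# done
```

A device flagged by this check is a candidate for the probe failure described above, but the kernel log remains the authoritative source.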
Disable PCI resource reallocation during boot by adding the pci=realloc=off parameter to the kernel command line in the GRUB configuration.
This preserves the original BAR assignments made by the system firmware (BIOS/UEFI) and prevents the kernel from overwriting them.
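On Ubuntu / Debian:
Edit the GRUB configuration file, /etc/default/grub.
Add pci=realloc=off to the GRUB_CMDLINE_LINUX line, appending it to any existing parameters.
Update GRUB so the change is written to the boot configuration.
Reboot the system.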
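The edit to the GRUB_CMDLINE_LINUX line can be scripted. The following is a minimal sketch, assuming a Debian/Ubuntu-style system where update-grub is available and the value is enclosed in double quotes. The helper name add_kernel_param is hypothetical and is not part of the SupremeRAID pre-installer.

```shell
# add_kernel_param FILE PARAM
# Hypothetical helper: appends PARAM to the GRUB_CMDLINE_LINUX="..." line in
# FILE unless it is already there. sed keeps a backup as FILE.bak.
add_kernel_param() {
    file="$1"; param="$2"
    line=$(grep -m1 '^GRUB_CMDLINE_LINUX=' "$file") || return 1
    value=$(printf '%s' "${line#GRUB_CMDLINE_LINUX=}" | tr -d '"')
    case " $value " in
        *" $param "*) return 0 ;;        # already present, nothing to do
    esac
    # Put " PARAM" before the closing quote, then drop the leading space
    # that appears when the original value was empty.
    sed -i.bak -E "/^GRUB_CMDLINE_LINUX=/ { s|\"[[:space:]]*\$| ${param}\"|; s|^(GRUB_CMDLINE_LINUX=\") |\1|; }" "$file"
}

# Example use (run as root):
# add_kernel_param /etc/default/grub pci=realloc=off
# update-grub
# reboot
```

Running the helper a second time leaves the file unchanged, so it is safe to include in provisioning scripts.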
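On RHEL / Rocky / AlmaLinux:
Edit the GRUB configuration file, /etc/default/grub.
Add pci=realloc=off to the kernel command line.
Rebuild the GRUB configuration. The output path differs between UEFI and BIOS systems; for BIOS mode, use /boot/grub2/grub.cfg.
Reboot the system.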
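Choosing the output file for the rebuild step is the part most likely to go wrong, so a small sketch may help. The helper name grub_cfg_path is hypothetical, and the EFI directory names in it (redhat, rocky, almalinux) are assumptions based on common defaults. Some newer releases use /boot/grub2/grub.cfg on UEFI systems as well, so check the distribution documentation before relying on these paths.

```shell
# grub_cfg_path OS_ID FIRMWARE
# Hypothetical helper: prints the grub.cfg path to pass to grub2-mkconfig -o.
# OS_ID is the ID= value from /etc/os-release; FIRMWARE is "uefi" or "bios".
grub_cfg_path() {
    os_id="$1"; firmware="$2"
    if [ "$firmware" = "bios" ]; then
        echo "/boot/grub2/grub.cfg"
        return 0
    fi
    case "$os_id" in
        rhel)      efi_dir="redhat" ;;     # assumed default directory names
        rocky)     efi_dir="rocky" ;;
        almalinux) efi_dir="almalinux" ;;
        *)         echo "unknown distribution: $os_id" >&2; return 1 ;;
    esac
    echo "/boot/efi/EFI/${efi_dir}/grub.cfg"
}

# Example use (run as root):
# firmware=bios; [ -d /sys/firmware/efi ] && firmware=uefi
# os_id=$(. /etc/os-release && echo "$ID")
# grub2-mkconfig -o "$(grub_cfg_path "$os_id" "$firmware")"
# reboot
```

The presence of /sys/firmware/efi is used here to tell a UEFI boot from a BIOS boot.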
After rebooting:
Check dmesg to confirm the BAR0 error no longer appears.
Run nvidia-smi to ensure all GPUs are detected without probe failures.
Verify that the SupremeRAID controller is detected.
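The first two checks can be scripted roughly as follows. This is a sketch under stated assumptions: the helper names has_kernel_param and count_bar0_lines are hypothetical, and matching the literal string BAR0 in the kernel log is an assumption about how the error is worded. No SupremeRAID-specific command is shown here.

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds when PARAM appears as a whole word on the kernel command line.
has_kernel_param() {
    cmdline_file="${2:-/proc/cmdline}"
    case " $(cat "$cmdline_file") " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# count_bar0_lines
# Reads kernel log text on stdin and prints how many lines mention BAR0.
count_bar0_lines() {
    grep -c 'BAR0' || true               # grep exits 1 on zero matches
}

# Example use:
# has_kernel_param pci=realloc=off && echo "pci=realloc=off is active"
# sudo dmesg | count_bar0_lines     # number of BAR0-related lines
# nvidia-smi -L                     # list the GPUs the driver can see
```

If the parameter is missing from /proc/cmdline after the reboot, the GRUB configuration was probably not regenerated or was written to the wrong file.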
If the issue persists, also verify:
Above 4G Decoding and Resizable BAR are enabled in BIOS.
PCIe slots and switch topology are correctly configured.
IOMMU settings are not causing conflicts.
The pci=realloc=off parameter prevents the kernel from reallocating PCI resources. This is generally safe for systems where firmware already assigns resources correctly (typical in GPU + RAID configurations).
On systems with many hot-plug devices or dynamic PCI topologies, this setting may limit some flexibility.
If you add or move devices later, re-evaluate BIOS and kernel parameters accordingly.