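MLNX_OFED is NVIDIA's high-performance network driver and communication stack. It supports InfiniBand and RoCE and is widely used for low-latency, high-speed communication in HPC, AI training, and data center environments.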
Mellanox OFED
Environment used in this guide:
- Ubuntu: 22.04.5
- Kernel: 5.15.0-119-generic
- Mellanox: MLNX_OFED_LINUX-23.10-1.1.9.0-ubuntu22.04-x86_64
1. Base environment
1.1 Configure APT sources
- Back up the existing sources.list (cp -n will not overwrite a backup that already exists)
[ -f /etc/apt/sources.list ] && cp -n /etc/apt/sources.list /etc/apt/sources.list.bak
- Write the Aliyun mirror entries for Ubuntu 22.04 (jammy)
cat <<'EOF' > /etc/apt/sources.list
deb https://mirrors.aliyun.com/ubuntu/ jammy main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ jammy main restricted universe multiverse
deb https://mirrors.aliyun.com/ubuntu/ jammy-security main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ jammy-security main restricted universe multiverse
deb https://mirrors.aliyun.com/ubuntu/ jammy-updates main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ jammy-updates main restricted universe multiverse
# deb https://mirrors.aliyun.com/ubuntu/ jammy-proposed main restricted universe multiverse
# deb-src https://mirrors.aliyun.com/ubuntu/ jammy-proposed main restricted universe multiverse
deb https://mirrors.aliyun.com/ubuntu/ jammy-backports main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ jammy-backports main restricted universe multiverse
EOF
- Refresh the package index
sudo apt update
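1.2 Kernel packages
Note: the kernel packages must match the version of the running kernel exactly (check it with uname -r), because the DKMS packages installed later build their modules against the installed headers.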
apt install linux-image-5.15.0-119-generic linux-headers-5.15.0-119-generic linux-tools-5.15.0-119-generic linux-cloud-tools-5.15.0-119-generic
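To avoid typing the kernel version by hand, the package names can be derived from the running kernel. The following is a minimal sketch, not part of the original procedure; the helper name kernel_packages is hypothetical, and the four package prefixes are simply the ones used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kernel package names for one kernel release.
# With no argument it uses the running kernel as reported by `uname -r`.
kernel_packages() {
  local release="${1:-$(uname -r)}"
  # printf reuses the format for every pair of arguments (package kind, release).
  printf 'linux-%s-%s\n' \
    image "$release" \
    headers "$release" \
    tools "$release" \
    cloud-tools "$release"
}
```

A possible use is `apt install $(kernel_packages)`, which expands to the same four packages as the explicit command when the running kernel is 5.15.0-119-generic.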
2. Mellanox NIC
- Download (pick the driver package that matches your OS and the version you need)
MLNX_OFED: MLNX_OFED Download Center
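Before extracting an archive, it can help to confirm that its name matches the local platform. The sketch below assumes the naming pattern seen in this guide's file name (a suffix of the form `<ID><VERSION_ID>-<arch>.tgz`, for example `ubuntu22.04-x86_64.tgz`); both function names are hypothetical, and the pattern is inferred from that one file name rather than taken from NVIDIA documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the platform suffix a matching archive name should end with.
# Optional arguments: path to an os-release file, machine architecture.
ofed_platform_suffix() {
  local os_release="${1:-/etc/os-release}"
  local arch="${2:-$(uname -m)}"
  # Source the file in a subshell so ID and VERSION_ID do not leak into the caller.
  ( . "$os_release" && printf '%s%s-%s\n' "$ID" "$VERSION_ID" "$arch" )
}

# Hypothetical helper: return 0 when the archive name ends with the suffix for this platform.
archive_matches_platform() {
  local archive="$1"; shift
  local suffix
  suffix="$(ofed_platform_suffix "$@")" || return 1
  case "$archive" in
    *"-${suffix}.tgz") return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `archive_matches_platform MLNX_OFED_LINUX-23.10-1.1.9.0-ubuntu22.04-x86_64.tgz || echo "archive does not match this system"` prints the warning only when the OS release or architecture differs.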
- Install
# Extract the archive and enter the directory
tar xf MLNX_OFED_LINUX-23.10-1.1.9.0-ubuntu22.04-x86_64.tgz && cd MLNX_OFED_LINUX-23.10-1.1.9.0-ubuntu22.04-x86_64
# Interactive install; answer Y to continue
./mlnxofedinstall --all
# Forced install
# ./mlnxofedinstall --all --force
- Installer output
root@ubuntu:~/MLNX_OFED_LINUX-23.10-1.1.9.0-ubuntu22.04-x86_64# ./mlnxofedinstall --all --force
Logs dir: /tmp/MLNX_OFED_LINUX.1730.logs
General log file: /tmp/MLNX_OFED_LINUX.1730.logs/general.log
Below is the list of MLNX_OFED_LINUX packages that you have chosen
(some may have been added by the installer due to package dependencies):
ofed-scripts
mlnx-tools
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-dkms
iser-dkms
isert-dkms
srp-dkms
rdma-core
libibverbs1
ibverbs-utils
ibverbs-providers
libibverbs-dev
libibverbs1-dbg
libibumad3
libibumad-dev
ibacm
librdmacm1
rdmacm-utils
librdmacm-dev
mstflint
ibdump
libibmad5
libibmad-dev
libopensm
opensm
opensm-doc
libopensm-devel
libibnetdisc5
infiniband-diags
mft
kernel-mft-dkms
perftest
ibutils2
ibsim
ibsim-doc
ucx
sharp
hcoll
knem-dkms
knem
openmpi
mpitests
dpcp
srptools
mlnx-ethtool
mlnx-iproute2
rshim
ibarr
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.
Checking SW Requirements...
One or more required packages for installing MLNX_OFED_LINUX are missing.
Attempting to install the following missing packages:
swig libgfortran5 gcc automake libnl-3-dev libltdl-dev pkg-config flex graphviz libnl-route-3-dev tk m4 libnl-route-3-200 autoconf make bison debhelper dkms quilt autotools-dev libc6-dev libfuse2 gfortran chrpath
Removing old packages...
Installing new packages
Installing ofed-scripts-23.10.OFED.23.10.1.1.9...
Installing mlnx-tools-23.10.0...
Installing mlnx-ofed-kernel-utils-23.10.OFED.23.10.1.1.9.1...
Installing mlnx-ofed-kernel-dkms-23.10.OFED.23.10.1.1.9.1...
Installing iser-dkms-23.10.OFED.23.10.1.1.9.1...
Installing isert-dkms-23.10.OFED.23.10.1.1.9.1...
Installing srp-dkms-23.10.OFED.23.10.1.1.9.1...
Installing rdma-core-2307mlnx47...
Installing libibverbs1-2307mlnx47...
Installing ibverbs-utils-2307mlnx47...
Installing ibverbs-providers-2307mlnx47...
Installing libibverbs-dev-2307mlnx47...
Installing libibverbs1-dbg-2307mlnx47...
Installing libibumad3-2307mlnx47...
Installing libibumad-dev-2307mlnx47...
Installing ibacm-2307mlnx47...
Installing librdmacm1-2307mlnx47...
Installing rdmacm-utils-2307mlnx47...
Installing librdmacm-dev-2307mlnx47...
Installing mstflint-4.16.1...
Installing ibdump-6.0.0...
Installing libibmad5-2307mlnx47...
Installing libibmad-dev-2307mlnx47...
Installing libopensm-5.17.0.MLNX20231105.d437ae0a...
Installing opensm-5.17.0.MLNX20231105.d437ae0a...
Installing opensm-doc-5.17.0.MLNX20231105.d437ae0a...
Installing libopensm-devel-5.17.0.MLNX20231105.d437ae0a...
Installing libibnetdisc5-2307mlnx47...
Installing infiniband-diags-2307mlnx47...
Installing mft-4.26.1...
Installing kernel-mft-dkms-4.26.1.3...
Installing perftest-23.10.0...
Installing ibutils2-2.1.1...
Installing ibsim-0.12...
Installing ibsim-doc-0.12...
Installing ucx-1.16.0...
Installing sharp-3.5.1.MLNX20231116.7fcef5af...
Installing hcoll-4.8.3223...
Installing knem-dkms-1.1.4.90mlnx3...
Installing knem-1.1.4.90mlnx3...
Installing openmpi-4.1.7a1...
Installing mpitests-3.2.21...
Installing dpcp-1.1.43...
Installing srptools-2307mlnx47...
Installing mlnx-ethtool-6.4...
Installing mlnx-iproute2-6.4.0...
Installing rshim-2.0.17...
Installing ibarr-0.1.3...
Selecting previously unselected package mlnx-fw-updater.
(Reading database ... 90967 files and directories currently installed.)
Preparing to unpack .../mlnx-fw-updater_23.10-1.1.9.0_amd64.deb ...
Unpacking mlnx-fw-updater (23.10-1.1.9.0) ...
Setting up mlnx-fw-updater (23.10-1.1.9.0) ...
Added 'RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf
Initializing...
Attempting to perform Firmware update...
No devices found!
Installation passed successfully
To load the new driver, run:
/etc/init.d/openibd restart
- Enable and start the service
systemctl enable openibd.service
systemctl start openibd.service
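Once the service is running, a quick check of the installed stack can confirm that the driver loaded. This is a minimal sketch under the assumption that ofed_info, ibv_devinfo, and ibstat are on PATH (they ship with the ofed-scripts, ibverbs-utils, and infiniband-diags packages listed in the installer output); the wrapper names run_if_present and check_ofed are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a tool with its arguments, or report that it is missing.
run_if_present() {
  local tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    "$@"
  else
    echo "skip: $tool not found"
  fi
}

# Hypothetical helper: print the OFED version and the device and port state.
check_ofed() {
  run_if_present ofed_info -s   # installed MLNX_OFED version string
  run_if_present ibv_devinfo    # RDMA devices visible to libibverbs
  run_if_present ibstat         # port state and link layer per adapter
}
```

Call `check_ofed` after the restart. On a host without an adapter, such as the one that produced the "No devices found!" line in the installer output, the device queries are expected to report no devices.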
- Installed components and their locations
- perftest: /usr/bin
- openmpi: /usr/mpi/gcc/openmpi-4.1.7a1
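Because Open MPI is installed under its own prefix, its tools are not on the default PATH. The sketch below prints the export lines for a given prefix; the function name openmpi_env is hypothetical, and the bin/ and lib/ subdirectories are an assumption about the layout under the prefix that should be checked on the target host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print export lines that put an Open MPI prefix on
# PATH and LD_LIBRARY_PATH. Assumes <prefix>/bin and <prefix>/lib exist.
openmpi_env() {
  local prefix="${1:-/usr/mpi/gcc/openmpi-4.1.7a1}"
  # The single-quoted format keeps $PATH literal so it expands when the line is evaluated.
  printf 'export PATH=%s/bin:$PATH\n' "$prefix"
  # Append the previous LD_LIBRARY_PATH only when it is set and non-empty.
  printf 'export LD_LIBRARY_PATH=%s/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}\n' "$prefix"
}
```

Running `eval "$(openmpi_env)"` in the current shell applies the settings, after which `mpirun --version` should resolve to the bundled build. The perftest binaries need no such step, since they are installed in /usr/bin.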