Notes about results of ibv_generate_infiniband_test_load.
Dual 10 GbE ports connected to a switch. AlmaLinux release 8.10 with Kernel 4.18.0-553.104.1.el8_10.x86_64.
Using the initial code, which was hard-coded to use GID index 0, which is RoCEv1.
Initially the active MTU was set to 1024:
$ ibv_devinfo
hca_id: rocep33s0f0
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 9803:9b03:0077:e152
sys_image_guid: 9803:9b03:0077:e152
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
hca_id: rocep33s0f1
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 9803:9b03:0077:e153
sys_image_guid: 9803:9b03:0077:e152
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
Results:
[mr_halfword@skylake-alma release]$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0
PRBS32 pattern period is 4294967295
Press Ctrl-C to stop the RDMA test load
^C
rocep33s0f1 port 1 rx_buffer compare : PASS
rocep33s0f0 port 1 -> rocep33s0f1 port 1 RDMA write transmitted 525059751936 bytes in 462.273715 seconds, 1135.8 Mbytes/sec
rocep33s0f0 port 1 transmitted 563887357956 bytes in 462.273715 seconds, 1219.8 Mbytes/sec
rocep33s0f0 port 1 received 564025459276 bytes in 462.273715 seconds, 1220.1 Mbytes/sec
rocep33s0f0 port 1 rx_buffer compare : PASS
rocep33s0f1 port 1 -> rocep33s0f0 port 1 RDMA write transmitted 525059751936 bytes in 462.241296 seconds, 1135.9 Mbytes/sec
rocep33s0f1 port 1 transmitted 564025556140 bytes in 462.241296 seconds, 1220.2 Mbytes/sec
rocep33s0f1 port 1 received 563887357956 bytes in 462.241296 seconds, 1219.9 Mbytes/sec
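As a cross-check on the figures above, the reported rates can be recomputed from the byte counts and elapsed time. This is a minimal sketch, not code from the test program; the function name is illustrative, and it assumes "Mbytes" means 10**6 bytes, which is the interpretation that reproduces the printed values.

```python
def mbytes_per_sec(total_bytes: int, seconds: float) -> float:
    """Return throughput in Mbytes/sec, taking 1 Mbyte as 10**6 bytes (assumption)."""
    return total_bytes / seconds / 1e6


# Figures copied from the rocep33s0f0 port 1 lines of the MTU 1024 run.
rdma_write = mbytes_per_sec(525059751936, 462.273715)
port_tx = mbytes_per_sec(563887357956, 462.273715)

print(f"RDMA write    : {rdma_write:.1f} Mbytes/sec")
print(f"Port transmit : {port_tx:.1f} Mbytes/sec")
```

The port transmit counter is higher than the RDMA write figure, presumably because it also counts per-packet headers and acknowledgement traffic in addition to the payload.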
Set the Ethernet device MTU to 9600 bytes:
[mr_halfword@skylake-alma release]$ sudo ip link set ens1f0 mtu 9600
[sudo] password for mr_halfword:
[mr_halfword@skylake-alma release]$ sudo ip link set ens1f1 mtu 9600
RoCE active MTU then increases to 4096:
$ ibv_devinfo
hca_id: rocep33s0f0
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 9803:9b03:0077:e152
sys_image_guid: 9803:9b03:0077:e152
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
hca_id: rocep33s0f1
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 9803:9b03:0077:e153
sys_image_guid: 9803:9b03:0077:e152
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
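The change from 1024 to 4096 is consistent with the RoCE active MTU being the largest InfiniBand MTU size that fits inside the Ethernet MTU once room is left for the RoCE headers. The sketch below illustrates that selection; the header allowance is an assumed illustrative value (the exact figure used by the driver is not given in these notes), and the function name is hypothetical.

```python
# InfiniBand MTU sizes paired with the enum codes that ibv_devinfo prints
# in parentheses, e.g. "1024 (3)" and "4096 (5)".
IB_MTUS = [(256, 1), (512, 2), (1024, 3), (2048, 4), (4096, 5)]

# Assumed allowance for RoCE headers inside the Ethernet payload.
# Illustrative only; the real driver value may differ.
ASSUMED_HEADER_BYTES = 100


def roce_active_mtu(ethernet_mtu: int, header_bytes: int = ASSUMED_HEADER_BYTES):
    """Return (size, enum_code) of the largest IB MTU fitting in the Ethernet MTU."""
    usable = ethernet_mtu - header_bytes
    candidates = [(size, code) for size, code in IB_MTUS if size <= usable]
    if not candidates:
        raise ValueError(f"Ethernet MTU {ethernet_mtu} is too small for any IB MTU")
    return max(candidates)


for eth_mtu in (1500, 9600):
    size, code = roce_active_mtu(eth_mtu)
    print(f"Ethernet MTU {eth_mtu} -> active_mtu: {size} ({code})")
```

With this rule an Ethernet MTU of 1500 cannot carry a 2048 byte payload, so the active MTU stays at 1024, while 9600 is large enough for the maximum of 4096.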
Results:
[mr_halfword@skylake-alma release]$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0
PRBS32 pattern period is 4294967295
Press Ctrl-C to stop the RDMA test load
^C
rocep33s0f1 port 1 rx_buffer compare : PASS
rocep33s0f0 port 1 -> rocep33s0f1 port 1 RDMA write transmitted 6019128229888 bytes in 4983.480160 seconds, 1207.8 Mbytes/sec
rocep33s0f0 port 1 transmitted 6176364832464 bytes in 4983.480160 seconds, 1239.4 Mbytes/sec
rocep33s0f0 port 1 received 6176369936084 bytes in 4983.480160 seconds, 1239.4 Mbytes/sec
rocep33s0f0 port 1 rx_buffer compare : PASS
rocep33s0f1 port 1 -> rocep33s0f0 port 1 RDMA write transmitted 6019128229888 bytes in 4983.495318 seconds, 1207.8 Mbytes/sec
rocep33s0f1 port 1 transmitted 6176369977964 bytes in 4983.495318 seconds, 1239.4 Mbytes/sec
rocep33s0f1 port 1 received 6176364832464 bytes in 4983.495318 seconds, 1239.4 Mbytes/sec
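To compare the two RoCEv1 runs, the sketch below computes what fraction of the bytes counted at the port was RDMA write payload, and how close the port transmit rate came to the nominal 10 GbE rate of 1250 Mbytes/sec. The figures are copied from the rocep33s0f0 port 1 lines of each run; exactly which framing bytes the port counters include is not stated in these notes, so the utilisation figure is only approximate.

```python
NOMINAL_LINE_RATE_MBYTES = 10e9 / 8 / 1e6  # 10 Gbit/s expressed in Mbytes/sec

# active MTU -> (RDMA write bytes, port transmitted bytes, elapsed seconds)
runs = {
    1024: (525059751936, 563887357956, 462.273715),
    4096: (6019128229888, 6176364832464, 4983.480160),
}


def summarise(payload_bytes: int, port_bytes: int, seconds: float):
    """Return (payload fraction of port bytes, fraction of nominal line rate)."""
    payload_fraction = payload_bytes / port_bytes
    line_utilisation = port_bytes / seconds / 1e6 / NOMINAL_LINE_RATE_MBYTES
    return payload_fraction, line_utilisation


for mtu, run in runs.items():
    payload_fraction, line_utilisation = summarise(*run)
    print(f"MTU {mtu}: payload {payload_fraction:.1%} of port bytes, "
          f"port at {line_utilisation:.1%} of nominal line rate")
```

The larger MTU raises the payload share of the transmitted bytes, which accounts for most of the gain in RDMA write throughput; the port itself is close to the nominal rate in both runs.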
Started with the Ethernet MTU at 1500 and the RDMA MTU at 1024.
Initial attempt to run using a RoCE v2 GID failed:
[mr_halfword@skylake-alma release]$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0 1
Assertion failed : ibv_modify_qp() IBV_QPS_RTR for rocep33s0f1 port 1 GID index 1 type RoCE V2 failed with Connection timed out
ip addr showed that the IPv6 link-local scope addresses on the interfaces differed from those assigned for the GIDs.
Manually added IPv6 addresses matching the GIDs to the interfaces:
[mr_halfword@skylake-alma ~]$ sudo ip -6 addr add fe80:0000:0000:0000:9a03:9bff:fe77:e152/64 scope link dev ens1f0
[sudo] password for mr_halfword:
[mr_halfword@skylake-alma ~]$ sudo ip -6 addr add fe80:0000:0000:0000:9a03:9bff:fe77:e153/64 scope link dev ens1f1
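The addresses added above have the modified EUI-64 form: the MAC address with its universal/local bit flipped and ff:fe inserted in the middle, under the fe80::/64 prefix. The sketch below shows that derivation. The MAC address used is an assumption, inferred by reversing the address in the first command rather than taken from the notes, and the function name is illustrative.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build the fe80::/64 link-local address from a MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 6 octet MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80:0000:0000:0000
    return ipaddress.IPv6Address(prefix + interface_id)


# Assumed MAC for ens1f0, inferred from the address added above.
address = eui64_link_local("98:03:9b:77:e1:52")
print(address.exploded)
```

If the interface's existing link-local address does not match this form, it was probably generated by a different method, which would explain the mismatch with the GIDs.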
The test could then be run:
[mr_halfword@skylake-alma release]$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0 1
PRBS32 pattern period is 4294967295
Press Ctrl-C to stop the RDMA test load
^C
rocep33s0f1 port 1 rx_buffer compare : PASS
rocep33s0f0 port 1 -> rocep33s0f1 port 1 RDMA write transmitted 69524783104 bytes in 61.683937 seconds, 1127.1 Mbytes/sec
rocep33s0f0 port 1 type RoCE V2 transmitted 75254331456 bytes in 61.683937 seconds, 1220.0 Mbytes/sec
rocep33s0f0 port 1 type RoCE V2 received 75258079592 bytes in 61.683937 seconds, 1220.1 Mbytes/sec
rocep33s0f0 port 1 rx_buffer compare : PASS
rocep33s0f1 port 1 -> rocep33s0f0 port 1 RDMA write transmitted 69524783104 bytes in 61.682691 seconds, 1127.1 Mbytes/sec
rocep33s0f1 port 1 type RoCE V2 transmitted 75258079592 bytes in 61.682691 seconds, 1220.1 Mbytes/sec
rocep33s0f1 port 1 type RoCE V2 received 75254331456 bytes in 61.682691 seconds, 1220.0 Mbytes/sec
Manually increased the Ethernet MTU to 9600:
[mr_halfword@skylake-alma ~]$ sudo ip link set ens1f0 mtu 9600
[mr_halfword@skylake-alma ~]$ sudo ip link set ens1f1 mtu 9600
This increased the RDMA MTU to 4096.
Results:
[mr_halfword@skylake-alma release]$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0 1
PRBS32 pattern period is 4294967295
Press Ctrl-C to stop the RDMA test load
^C
rocep33s0f1 port 1 rx_buffer compare : PASS
rocep33s0f0 port 1 -> rocep33s0f1 port 1 RDMA write transmitted 300916146176 bytes in 249.843426 seconds, 1204.4 Mbytes/sec
rocep33s0f0 port 1 type RoCE V2 transmitted 309655784520 bytes in 249.843426 seconds, 1239.4 Mbytes/sec
rocep33s0f0 port 1 type RoCE V2 received 309392154676 bytes in 249.843426 seconds, 1238.3 Mbytes/sec
rocep33s0f0 port 1 rx_buffer compare : PASS
rocep33s0f1 port 1 -> rocep33s0f0 port 1 RDMA write transmitted 300647710720 bytes in 249.646860 seconds, 1204.3 Mbytes/sec
rocep33s0f1 port 1 type RoCE V2 transmitted 309392221868 bytes in 249.646860 seconds, 1239.3 Mbytes/sec
rocep33s0f1 port 1 type RoCE V2 received 309655784520 bytes in 249.646860 seconds, 1240.4 Mbytes/sec
Rather than manually adding IPv6 addresses to allow RoCEv2 to be used, the approach in "7.5 Changing NetworkManager configuration to generate IPv6 link-scope address based upon the MAC address" could be more maintainable.
Dual 40 Gb/s InfiniBand ports, with a PCIe gen2 x8 interface (4 GB/s peak bandwidth). Ubuntu 24.04.3 LTS with Kernel 6.8.0-94-generic.
Results:
$ ibv_generate_infiniband_test_load/ibv_generate_infiniband_test_load 0
PRBS32 pattern period is 4294967295
Press Ctrl-C to stop the RDMA test load
^C
ibp3s0 port 2 rx_buffer compare : PASS
ibp3s0 port 1 -> ibp3s0 port 2 RDMA write transmitted 27435177345024 bytes in 16726.068460 seconds, 1640.3 Mbytes/sec
ibp3s0 port 1 transmitted 27596666524268 bytes in 16726.068460 seconds, 1649.9 Mbytes/sec
ibp3s0 port 1 received 27596666528580 bytes in 16726.068460 seconds, 1649.9 Mbytes/sec
ibp3s0 port 1 rx_buffer compare : PASS
ibp3s0 port 2 -> ibp3s0 port 1 RDMA write transmitted 27435177345024 bytes in 16726.068468 seconds, 1640.3 Mbytes/sec
ibp3s0 port 2 transmitted 27596666047044 bytes in 16726.068468 seconds, 1649.9 Mbytes/sec
ibp3s0 port 2 received 27596666042732 bytes in 16726.068468 seconds, 1649.9 Mbytes/sec
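For context on the InfiniBand figures, the sketch below derives the 4 GB/s PCIe gen2 x8 peak quoted above (5 GT/s per lane with 8b/10b encoding) and compares it with the combined transmit rate of the two ports. It assumes both ports share the one PCIe link and that all transmitted data crosses it in the same direction; PCIe packet overhead is not modelled, so the real usable bandwidth is lower than the peak.

```python
def pcie_gen2_peak_mbytes(lanes: int) -> float:
    """Peak PCIe gen2 bandwidth per direction in Mbytes/sec (10**6 bytes)."""
    transfers_per_sec = 5e9   # 5 GT/s per lane for PCIe gen2
    encoding = 8 / 10         # 8b/10b encoding: 8 data bits per 10 line bits
    return transfers_per_sec * encoding * lanes / 8 / 1e6


peak = pcie_gen2_peak_mbytes(8)

# Port transmitted bytes and elapsed seconds copied from the results above.
port1_tx = 27596666524268 / 16726.068460 / 1e6
port2_tx = 27596666047044 / 16726.068468 / 1e6
combined_tx = port1_tx + port2_tx

print(f"PCIe gen2 x8 peak : {peak:.0f} Mbytes/sec")
print(f"Combined port tx  : {combined_tx:.1f} Mbytes/sec "
      f"({combined_tx / peak:.1%} of peak)")
```

Under these assumptions the two ports together use a large share of the PCIe peak, which suggests the shared PCIe gen2 x8 link, rather than the 40 Gb/s ports, may be what limits the per-port rate.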