VMI and MPICH-VMI Latency
The VMI latency data graphed below was measured using the latency benchmark installed in the benchmarks directory of the VMI install tree. latency measures the ping-pong latency of an interconnect by averaging the latency over n messages, each of size s, where n and s are specified by the user.
The MPICH-VMI latency was measured using the latency_mpi benchmark installed in the tools directory of the MPICH-VMI install tree. latency_mpi calculates both the ping-pong latency (round-trip latency) and the one-way latency of the interconnect using the MPI messaging interface.
By default, MPICH-VMI switches from the eager protocol (implemented as streams) to the rendezvous protocol (implemented as RDMAs) when sending or receiving data of size 16k and over. (The message length at which rendezvous takes over can also be set by the user on the command line.) So for all MPICH-VMI latency curves plotted below, data points at message sizes of 16k and over are measurements of the latency of the rendezvous protocol over RDMA.
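The averaging scheme described above can be illustrated with a minimal sketch. This is not the VMI latency benchmark itself: it times a ping-pong over a local socket pair in Python, and the names pingpong_latency_usec, _echo, and _recv_exact are hypothetical helpers introduced only for illustration. It assumes s is at least 1 byte.

```python
import socket
import threading
import time


def _recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (stream sockets may return short reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def _echo(sock, n, s):
    """Peer side: receive each s-byte message and send it straight back."""
    for _ in range(n):
        sock.sendall(_recv_exact(sock, s))


def pingpong_latency_usec(n, s):
    """Average round-trip time in microseconds over n messages of s bytes."""
    a, b = socket.socketpair()
    peer = threading.Thread(target=_echo, args=(b, n, s))
    peer.start()
    payload = b"x" * s
    start = time.perf_counter()
    for _ in range(n):
        a.sendall(payload)   # ping
        _recv_exact(a, s)    # pong
    elapsed = time.perf_counter() - start
    peer.join()
    a.close()
    b.close()
    return elapsed / n * 1e6
```

Calling pingpong_latency_usec(1000, 64), for example, would report the mean round-trip time for 1000 messages of 64 bytes on the local machine. Averaging over many messages smooths out timer resolution and scheduling noise, which is presumably why the benchmark takes n as a parameter.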
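The size-based protocol switch can be pictured with a small sketch. This is only an illustration of the rule stated above, not MPICH-VMI code: the function name choose_protocol and the constant are hypothetical, and the sketch assumes that "16k" means 16384 bytes.

```python
# Assumption: "16k" is taken to mean 16 * 1024 = 16384 bytes.
RENDEZVOUS_THRESHOLD_BYTES = 16 * 1024


def choose_protocol(msg_size, threshold=RENDEZVOUS_THRESHOLD_BYTES):
    """Return the protocol name used for a message of msg_size bytes.

    Messages below the threshold go eager; messages at or above it
    go rendezvous, mirroring the "16k and over" rule in the text.
    """
    if msg_size < 0:
        raise ValueError("message size must be non-negative")
    return "rendezvous" if msg_size >= threshold else "eager"
```

Passing a different threshold argument mimics a user overriding the default message length on the command line.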
IA-32 Latency Measurements
To measure latency between IA-32 machines, we used two Dell PowerEdge 1750 servers, each with two hyperthreaded Intel Xeon 3.06 GHz processors (3 GB RAM, 512 KB L2 cache), running Linux 2.4.20-30smp kernels for all latency measurements.
For InfiniBand, both machines had an InfiniHost MT23108 HCA (a1 silicon) on a 64-bit PCI-X 133 MHz bus with thca-3.1.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 1.77 | 5.35 |
| 1 | 7.4815 | 1.74 | 5.34 |
| 2 | 7.7615 | 1.78 | 5.34 |
| 4 | 7.539 | 1.74 | 5.35 |
| 8 | 7.6405 | 1.75 | 5.17 |
| 16 | 7.701 | 1.78 | 5.27 |
| 32 | 7.7645 | 1.77 | 6.23 |
| 64 | 8.935 | 1.86 | 6.38 |
| 128 | 9.172 | 1.85 | 6.63 |
| 256 | 9.761 | 1.91 | 7.14 |
| 512 | 10.943 | 1.94 | 8.13 |
| 1024 | 13.26 | 2.17 | 10.11 |
| 2048 | 14.09 | 2.79 | 12.01 |
| 4096 | 16.311 | 3.92 | 15.91 |
| 8192 | 21.204 | 6.38 | 23.57 |
| 16384 | 27.599 | 21.99 | 41.49 |
| 32768 | 45.91 | 31.35 | 60.12 |
| 65536 | 82.534 | 50.32 | 97.83 |
| 131072 | 155.91 | 88.08 | 173.2 |
| 262144 | 302.69 | 163.67 | 324.14 |
| 524288 | 767.82 | 315.02 | 626.03 |
| 1048576 | 1580.9 | 617.7 | 1229.82 |
| 2097152 | 3157.5 | 1230.85 | 2464.59 |
| 4194304 | 6369.4 | 2449.79 | 4912.63 |
For Myrinet, both machines had a LANai 10 on a 64-bit PCI-X 100 MHz bus with the gm-2.0.11 driver.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 3.06 | 8.1 |
| 1 | 6.559 | 3.04 | 8.29 |
| 2 | 6.5475 | 3.04 | 8.17 |
| 4 | 6.5505 | 3.05 | 8.16 |
| 8 | 6.6135 | 3.04 | 7.87 |
| 16 | 6.8275 | 3.05 | 8.06 |
| 32 | 6.7745 | 3.05 | 8.22 |
| 64 | 6.9785 | 3.07 | 8.38 |
| 128 | 8.0425 | 3.1 | 8.91 |
| 256 | 10.412 | 3.13 | 9.8 |
| 512 | 12.775 | 3.28 | 11.41 |
| 1024 | 16.117 | 3.52 | 14.86 |
| 2048 | 22.362 | 4.38 | 21.97 |
| 4096 | 31.069 | 8.5 | 31.78 |
| 8192 | 49.44 | 16.8 | 52.12 |
| 16384 | 84.442 | 51.25 | 99.88 |
| 32768 | 150.83 | 84.57 | 166.1 |
| 65536 | 283.11 | 150.92 | 298.65 |
| 131072 | 548.13 | 283.57 | 563.51 |
| 262144 | 1079.8 | 548.74 | 1096.66 |
| 524288 | 2143.2 | 1079.28 | 2153.68 |
| 1048576 | 4259.5 | 2151.03 | 4273.87 |
| 2097152 | 8498.9 | 4273.02 | 8526.98 |
| 4194304 | 16980 | 8510.22 | 17010.5 |
For Ethernet, a Broadcom Gigabit Ethernet NIC with the bcm5700 driver was used.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 67.29 | 61.7 |
| 1 | 63.876 | 12.52 | 59.6 |
| 2 | 63.825 | 12.53 | 59.53 |
| 4 | 63.648 | 12.59 | 59.51 |
| 8 | 63.585 | 12.52 | 59.65 |
| 16 | 64.417 | 12.55 | 60.01 |
| 32 | 66.872 | 12.64 | 60.25 |
| 64 | 67.19 | 12.66 | 61.99 |
| 128 | 69.098 | 12.84 | 63.78 |
| 256 | 73.893 | 13.11 | 67.84 |
| 512 | 81.436 | 12.88 | 76.17 |
| 1024 | 95.627 | 13.72 | 90.2 |
| 2048 | 118.56 | 13.79 | 115.93 |
| 4096 | 171.4 | 21.67 | 166.75 |
| 8192 | 281.16 | 38.61 | 282.81 |
| 16384 | 346.35 | 203.39 | 783.67 |
| 32768 | 473.5 | 267.55 | 608.13 |
| 65536 | 748.15 | 407.03 | 979.32 |
| 131072 | 1279.8 | 670.2 | 1398.55 |
| 262144 | 2369 | 1238.92 | 3102.5 |
| 524288 | 4487.2 | 2280.46 | 4634.07 |
| 1048576 | 8767.1 | 4422.39 | 9152.83 |
| 2097152 | 17282 | 8689.43 | 17438.63 |
| 4194304 | 34325 | 17201.88 | 34447.29 |
IA-64 Latency Measurements
To measure latency between IA-64 machines over Myrinet, we used two dual-processor Intel Itanium 2 machines with 1.3 GHz processors (4 GB RAM) running SUSE SLES8 with a 2.4.21 kernel. InfiniBand, GigE, and shared memory measurements were done with two dual-processor Intel Itanium 2 machines with 900 MHz processors (2 GB RAM) running a 2.4.21 kernel.
For InfiniBand, both machines had an InfiniHost MT23108 HCA (a1 silicon) on a 64-bit PCI-X 133 MHz bus with thca-3.1.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 3.96 | 7.97 |
| 1 | 10.193 | 3.98 | 7.97 |
| 2 | 10.17 | 3.96 | 7.95 |
| 4 | 10.213 | 3.97 | 7.95 |
| 8 | 10.243 | 3.97 | 8.03 |
| 16 | 10.301 | 3.88 | 8.77 |
| 32 | 10.487 | 3.84 | 8.8 |
| 64 | 11.694 | 3.92 | 9.05 |
| 128 | 11.901 | 3.91 | 9.42 |
| 256 | 12.233 | 3.94 | 10.05 |
| 512 | 12.857 | 3.94 | 11.35 |
| 1024 | 14.713 | 4.03 | 13.27 |
| 2048 | 15.831 | 4.15 | 15.4 |
| 4096 | 17.858 | 4.8 | 19.78 |
| 8192 | 23.048 | 6.78 | 28.55 |
| 16384 | 33.666 | 28.61 | 54.43 |
| 32768 | 55.284 | 39.64 | 76.47 |
| 65536 | 98.578 | 61.74 | 120.72 |
| 131072 | 185.03 | 106.03 | 209.28 |
| 262144 | 357.93 | 194.65 | 386.32 |
| 524288 | 703.7 | 371.92 | 740.63 |
| 1048576 | 1394.9 | 726.27 | 1448.75 |
| 2097152 | 2780.5 | 1439.38 | 2881.57 |
| 4194304 | 5772.1 | 2869.08 | 5752.88 |
For Myrinet, both machines had a LANai 10.0 on a 64-bit PCI-X 100 MHz bus with the gm-2.0.11 driver.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 3.74 | 11.93 |
| 1 | 9.8315 | 3.63 | 12.84 |
| 2 | 11.19 | 3.61 | 12.1 |
| 4 | 10.059 | 3.57 | 12.44 |
| 8 | 10.958 | 3.55 | 12.66 |
| 16 | 10.987 | 3.56 | 13.07 |
| 32 | 11.787 | 4.08 | 13.04 |
| 64 | 12.235 | 3.62 | 12.58 |
| 128 | 14.027 | 3.78 | 12.87 |
| 256 | 16.507 | 4.09 | 14.82 |
| 512 | 18.097 | 3.91 | 17.31 |
| 1024 | 21.661 | 4.37 | 20.8 |
| 2048 | 29.29 | 5.77 | 28.49 |
| 4096 | 41.239 | 9.39 | 42.3 |
| 8192 | 66.527 | 17.66 | 66.96 |
| 16384 | 98.584 | 62.19 | 118.49 |
| 32768 | 167.79 | 102.26 | 190.25 |
| 65536 | 311.48 | 177.99 | 334.06 |
| 131072 | 604.75 | 316.03 | 625.26 |
| 262144 | 1175.5 | 621.27 | 1210.48 |
| 524288 | 2340 | 1214.98 | 2370.88 |
| 1048576 | 4677.4 | 2395.16 | 4690.53 |
| 2097152 | 9316.2 | 4753.94 | 9343.18 |
| 4194304 | 18612 | 9468.52 | 18654.69 |
For Ethernet, a Broadcom Gigabit Ethernet NIC with the bcm5700 driver was used.
| Msg Size (Bytes) | VMI Latency (usec) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|---|
| 0 | | 16.94 | 73.68 |
| 1 | 81.483 | 15.65 | 72.63 |
| 2 | 83.221 | 16.55 | 73.78 |
| 4 | 83.126 | 15.28 | 72.5 |
| 8 | 81.521 | 16.76 | 73.94 |
| 16 | 83.673 | 15.18 | 72.9 |
| 32 | 81.658 | 16.22 | 74.95 |
| 64 | 82.608 | 15.05 | 73.92 |
| 128 | 84.894 | 15.71 | 77.98 |
| 256 | 89.421 | 17.43 | 80.11 |
| 512 | 96.609 | 15.67 | 89.7 |
| 1024 | 111.87 | 16.97 | 103.39 |
| 2048 | 130.96 | 19.27 | 122.44 |
| 4096 | 155.5 | 23.69 | 147.89 |
| 8192 | 176.8 | 35.77 | 172.17 |
| 16384 | 254.44 | 171.26 | 407.17 |
| 32768 | 394.12 | 236.05 | 541.55 |
| 65536 | 706.22 | 386.82 | 928.86 |
| 131072 | 1276.1 | 720.57 | 1562.82 |
| 262144 | 2464.4 | 1256.09 | 2788.56 |
| 524288 | 4629 | 2446.51 | 4859.01 |
| 1048576 | 9102.6 | 4631.63 | 9345.74 |
| 2097152 | 18126 | 9108.6 | 18355.93 |
| 4194304 | 36034 | 18111.74 | 36304.89 |
Shared memory.
| Msg Size (Bytes) | MPI Latency (usec) | MPI Latency (usec) |
|---|---|---|
| 0 | 0.46 | 1.66 |
| 1 | 0.53 | 1.78 |
| 2 | 0.47 | 1.72 |
| 4 | 0.48 | 1.74 |
| 8 | 0.5 | 1.72 |
| 16 | 0.48 | 1.73 |
| 32 | 0.51 | 1.77 |
| 64 | 0.45 | 2.08 |
| 128 | 0.5 | 2.18 |
| 256 | 0.53 | 2.44 |
| 512 | 0.6 | 2.91 |
| 1024 | 0.77 | 3.43 |
| 2048 | 1.13 | 4.33 |
| 4096 | 1.91 | 6.1 |
| 8192 | 3.48 | 9.68 |
| 16384 | 9.32 | 19.55 |
| 32768 | 16.12 | 33.32 |
| 65536 | 29.46 | 60.3 |
| 131072 | 55.9 | 114.07 |
| 262144 | 108.33 | 222.47 |
| 524288 | 224.28 | 458.8 |
| 1048576 | 710.98 | 1508.33 |
| 2097152 | 1902.46 | 3642.25 |
| 4194304 | 3891.65 | 7575.39 |