2.6.16-rc6 Transparent Paravirtualization Performance Scoreboard
Updated: 03/20/2006 * Contact: Anne Holler (anne@xxxxxxxxxx)

Throughput benchmarks -> HIGHER IS BETTER -> Higher ratio is better

                 P4                     Opteron
                 VMI-Native/Native      VMI-Native/Native      Comments
 Dbench
  1client        1.00 [312/311]         1.00 [425/425]
 Netperf
  Receive        1.00 [948/947]         1.00 [937/937]         CpuUtil:P4(VMI:43%,Ntv:42%);Opteron(VMI:36%,Ntv:34%)
  Send           1.00 [939/939]         1.00 [937/936]         CpuUtil:P4(VMI:25%,Ntv:25%);Opteron(VMI:62%,Ntv:60%)

Latency benchmarks -> LOWER IS BETTER -> Lower ratio is better

                 P4                     Opteron
                 VMI-Native/Native      VMI-Native/Native      Comments
 Kernel compile
  UP             1.00 [221/220]         1.00 [131/131]
  SMP/2way       1.00 [117/117]         1.00 [67/67]
 Lmbench process time latencies
  null call      1.00 [0.17/0.17]       1.00 [0.08/0.08]
  null i/o       1.00 [0.29/0.29]       0.92 [0.23/0.25]       opteron: wide confidence interval
  stat           0.99 [2.14/2.16]       0.94 [2.25/2.39]       opteron: odd, 1% outside wide confidence interval
  open clos      1.01 [3.00/2.96]       0.98 [3.16/3.24]
  slct TCP       1.00 [8.84/8.83]       0.94 [11.8/12.5]       opteron: wide confidence interval
  sig inst       0.99 [0.68/0.69]       1.09 [0.36/0.33]       opteron: best is 1.03 [0.34/0.33]
  sig hndl       0.99 [2.19/2.21]       1.05 [1.20/1.14]       opteron: best is 1.02 [1.13/1.11]
  fork proc      1.02 [137/134]         1.04 [100/96]
  exec proc      1.02 [536/525]         1.03 [309/301]
  sh proc        1.01 [3204/3169]       1.02 [1551/1528]
 Lmbench context switch time latencies
  2p/0K          1.00 [2.84/2.84]       1.14 [0.74/0.65]       opteron: wide confidence interval
  2p/16K         1.01 [2.98/2.95]       0.93 [0.74/0.80]       opteron: wide confidence interval
  2p/64K         1.02 [3.06/3.01]       1.00 [4.19/4.18]
  8p/16K         1.02 [3.31/3.26]       0.97 [1.86/1.91]
  8p/64K         1.01 [30.4/30.0]       1.00 [4.33/4.34]
  16p/16K        0.96 [7.76/8.06]       0.97 [2.03/2.10]
  16p/64K        1.00 [41.5/41.4]       1.00 [15.9/15.9]
 Lmbench system latencies
  Mmap           1.02 [6681/6542]       1.00 [3452/3441]
  Prot Fault     1.06 [0.920/0.872]     1.07 [0.197/0.184]     p4+opteron: wide confidence interval
  Page Fault     1.01 [2.065/2.050]     1.00 [1.10/1.10]
 Kernel Microbenchmarks
  getppid        1.00 [1.70/1.70]       1.00 [0.83/0.83]
  segv           0.99 [7.05/7.09]       1.08 [2.95/2.72]
  forkwaitn      1.02 [3.60/3.54]       1.05 [2.61/2.48]
  divzero        0.99 [5.68/5.73]       1.09 [2.71/2.48]

System Configurations:
 P4:      CPU: 2.4GHz; MEM: 1024MB; DISK: 10K SCSI;
          Server+Client NICs: Intel e1000 server adapter
 Opteron: CPU: 2.2GHz; MEM: 1024MB; DISK: 10K SCSI;
          Server+Client NICs: Broadcom NetXtreme BCM5704
 UP kernel used for all workloads except SMP kernel compile

Benchmark Descriptions:
 Dbench: repeat N times until 95% confidence interval 5% around mean; report mean
   version 2.0 run as "time ./dbench -c client_plain.txt 1"
 Netperf: best of 5 runs
   MessageSize:8192+SocketSize:65536; netperf -H client-ip -l 60 -t TCP_STREAM
 Kernel compile: best of 3 runs
   Build of 2.6.11 kernel w/gcc 4.0.2 via "time make -j 16 bzImage"
 Lmbench: average of best 18 of 30 runs
   version 3.0-a4; obtained from sourceforge
 Kernel microbenchmarks: average of best 3 of 5 runs
   getppid: loop of 10 calls to getppid, repeated 1,000,000 times
   segv: signal of SIGSEGV, repeated 3,000,000 times
   forkwaitn: fork/wait for child to exit, repeated 40,000 times
   divzero: divide by 0 fault 3,000,000 times
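
The aggregation rules in the benchmark descriptions can be sketched in a few lines of Python. This is only an illustration, not the scripts used to produce the scoreboard. The function names are hypothetical, "best" is assumed to mean lowest for latency and highest for throughput, and the Dbench stopping rule is assumed to mean that the 95% confidence half-width (normal approximation, z = 1.96) is within 5% of the mean.

```python
import math
import statistics


def average_of_best(samples, k, lower_is_better=True):
    """Average the k best samples, e.g. best 18 of 30 for Lmbench.

    Assumption: "best" means the lowest values for latency results
    and the highest values for throughput results.
    """
    if not 0 < k <= len(samples):
        raise ValueError("k must be between 1 and len(samples)")
    ranked = sorted(samples, reverse=not lower_is_better)
    return statistics.fmean(ranked[:k])


def ratio(vmi_native, native):
    """VMI-Native/Native ratio, rounded to two decimals as in the table."""
    return round(vmi_native / native, 2)


def ci_is_tight(samples, rel_width=0.05, z=1.96):
    """Return True if the confidence half-width is within rel_width of the mean.

    Assumption: one reading of the Dbench rule "95% confidence interval
    5% around mean", using a normal approximation rather than Student's t.
    """
    if len(samples) < 2:
        return False
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return half_width <= rel_width * abs(mean)


if __name__ == "__main__":
    # Opteron "sig inst" row: 0.36 (VMI-Native) vs 0.33 (Native)
    print(ratio(0.36, 0.33))  # 1.09
    # P4 "16p/16K" context switch row: 7.76 vs 8.06
    print(ratio(7.76, 8.06))  # 0.96
```

Because the bracketed values in the scoreboard are already rounded, recomputing a ratio from them can differ in the last digit from a ratio computed on unrounded measurements.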