2.6.16-rc6 Transparent Paravirtualization Performance Scoreboard
Updated: 03/20/2006 * Contact: Anne Holler (anne@xxxxxxxxxx)

Throughput benchmarks -> HIGHER IS BETTER -> Higher ratio is better
                     P4                  Opteron 
                     VMI-Native/Native   VMI-Native/Native   Comments
 Dbench
  1client            1.00 [312/311]      1.00 [425/425]
 Netperf
  Receive            1.00 [948/947]      1.00 [937/937]      CpuUtil:P4(VMI:43%,Ntv:42%);Opteron(VMI:36%,Ntv:34%)
  Send               1.00 [939/939]      1.00 [937/936]      CpuUtil:P4(VMI:25%,Ntv:25%);Opteron(VMI:62%,Ntv:60%)

Latency benchmarks -> LOWER IS BETTER -> Lower ratio is better
                     P4                  Opteron 
                     VMI-Native/Native   VMI-Native/Native   Comments
 Kernel compile
  UP                 1.00 [221/220]      1.00 [131/131]
  SMP/2way           1.00 [117/117]      1.00 [67/67]
 Lmbench process time latencies
  null call          1.00 [0.17/0.17]    1.00 [0.08/0.08]
  null i/o           1.00 [0.29/0.29]    0.92 [0.23/0.25]    opteron: wide confidence interval
  stat               0.99 [2.14/2.16]    0.94 [2.25/2.39]    opteron: odd, 1% outside wide confidence interval
  open clos          1.01 [3.00/2.96]    0.98 [3.16/3.24]
  slct TCP           1.00 [8.84/8.83]    0.94 [11.8/12.5]    opteron: wide confidence interval
  sig inst           0.99 [0.68/0.69]    1.09 [0.36/0.33]    opteron: best is 1.03 [0.34/0.33]
  sig hndl           0.99 [2.19/2.21]    1.05 [1.20/1.14]    opteron: best is 1.02 [1.13/1.11]
  fork proc          1.02 [137/134]      1.04 [100/96]
  exec proc          1.02 [536/525]      1.03 [309/301]
  sh proc            1.01 [3204/3169]    1.02 [1551/1528]
 Lmbench context switch time latencies
  2p/0K              1.00 [2.84/2.84]    1.14 [0.74/0.65]    opteron: wide confidence interval
  2p/16K             1.01 [2.98/2.95]    0.93 [0.74/0.80]    opteron: wide confidence interval
  2p/64K             1.02 [3.06/3.01]    1.00 [4.19/4.18]
  8p/16K             1.02 [3.31/3.26]    0.97 [1.86/1.91]
  8p/64K             1.01 [30.4/30.0]    1.00 [4.33/4.34]
  16p/16K            0.96 [7.76/8.06]    0.97 [2.03/2.10]
  16p/64K            1.00 [41.5/41.4]    1.00 [15.9/15.9]
 Lmbench system latencies
  Mmap               1.02 [6681/6542]    1.00 [3452/3441]
  Prot Fault         1.06 [0.920/0.872]  1.07 [0.197/0.184]  p4+opteron: wide confidence interval
  Page Fault         1.01 [2.065/2.050]  1.00 [1.10/1.10]
 Kernel Microbenchmarks
  getppid            1.00 [1.70/1.70]    1.00 [0.83/0.83]
  segv               0.99 [7.05/7.09]    1.08 [2.95/2.72]
  forkwaitn          1.02 [3.60/3.54]    1.05 [2.61/2.48]
  divzero            0.99 [5.68/5.73]    1.09 [2.71/2.48]
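
How to read the ratio columns: each entry is the VMI-Native measurement
divided by the Native measurement, shown as "ratio [VMI-Native/Native]".
The following is a minimal sketch of that arithmetic, not the tooling used to
produce the scoreboard; the function names are hypothetical. Because the
bracketed values are themselves rounded, a recomputed ratio can differ from the
printed one in the last digit.

```python
def ratio(vmi_native: float, native: float) -> float:
    """VMI-Native/Native ratio, rounded to two decimals as in the tables."""
    return round(vmi_native / native, 2)


def is_regression(r: float, higher_is_better: bool) -> bool:
    """True when the ratio says VMI-Native did worse than Native.

    Throughput tables (higher is better): a ratio below 1.00 is worse.
    Latency tables (lower is better): a ratio above 1.00 is worse.
    """
    return r < 1.0 if higher_is_better else r > 1.0


if __name__ == "__main__":
    # Values copied from the tables above.
    fork_p4 = ratio(137, 134)          # P4 "fork proc" latency
    sig_inst_opt = ratio(0.36, 0.33)   # Opteron "sig inst" latency
    stat_opt = ratio(2.25, 2.39)       # Opteron "stat" latency
    print(fork_p4, is_regression(fork_p4, higher_is_better=False))
    print(sig_inst_opt, is_regression(sig_inst_opt, higher_is_better=False))
    print(stat_opt, is_regression(stat_opt, higher_is_better=False))
```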

System Configurations:
 P4:      CPU: 2.4GHz; MEM: 1024MB; DISK: 10K SCSI; Server+Client NICs: Intel e1000 server adapter
 Opteron: CPU: 2.2GHz; MEM: 1024MB; DISK: 10K SCSI; Server+Client NICs: Broadcom NetXtreme BCM5704
 UP kernel used for all workloads except SMP kernel compile

Benchmark Descriptions:
 Dbench: repeat runs until the 95% confidence interval is within 5% of the mean; report mean
  version 2.0 run as "time ./dbench -c client_plain.txt 1"
 Netperf: best of 5 runs
  MessageSize:8192+SocketSize:65536; netperf -H client-ip -l 60 -t TCP_STREAM
 Kernel compile: best of 3 runs
  Build of 2.6.11 kernel w/gcc 4.0.2 via "time make -j 16 bzImage"
 Lmbench: average of best 18 of 30 runs
  version 3.0-a4; obtained from sourceforge
 Kernel microbenchmarks: average of best 3 of 5 runs
  getppid: loop of 10 calls to getppid, repeated 1,000,000 times
  segv: signal of SIGSEGV, repeated 3,000,000 times
  forkwaitn: fork/wait for child to exit, repeated 40,000 times
  divzero: divide-by-0 fault, repeated 3,000,000 times
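
The aggregation rules above ("best of N", "average of best K of N", and the
Dbench confidence-interval stopping rule) can be sketched as follows. This is
an illustration under stated assumptions, not the scripts behind the
scoreboard: all function names are hypothetical, the confidence interval uses
a normal approximation (z = 1.96) rather than Student's t, and "5% around
mean" is read as a half-width of at most 5% of the mean.

```python
import math
import statistics


def best_of(samples, higher_is_better):
    """Best single run: max for throughput, min for latency."""
    return max(samples) if higher_is_better else min(samples)


def mean_of_best(samples, keep, higher_is_better=False):
    """Average of the best `keep` runs (e.g. best 18 of 30, best 3 of 5)."""
    ordered = sorted(samples, reverse=higher_is_better)
    return statistics.mean(ordered[:keep])


def ci_within(samples, fraction=0.05, z=1.96):
    """True when the 95% CI half-width is within `fraction` of the mean.

    Assumption: normal approximation with z = 1.96.
    """
    if len(samples) < 2:
        return False
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return half_width <= fraction * abs(mean)


def run_until_confident(run_once, max_runs=100):
    """Repeat `run_once` until ci_within() holds; return (mean, run count).

    `max_runs` is an assumed safety cap; the source does not state one.
    """
    samples = []
    while len(samples) < max_runs:
        samples.append(run_once())
        if ci_within(samples):
            break
    return statistics.mean(samples), len(samples)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(best_of([948, 940, 945], higher_is_better=True))
    print(mean_of_best([3.0, 1.0, 2.0, 10.0], keep=2))
    fake_runs = iter([310, 312, 311])
    print(run_until_confident(lambda: next(fake_runs)))
```

Dropping the slowest runs before averaging, as the Lmbench and microbenchmark
rules do, reduces the influence of outliers such as background activity.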