
Re: [Xen-users] disk performance about half in domU? + question about XenSource


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Johnn Tan <linuxweb@xxxxxxxxx>
  • Date: Tue, 14 Aug 2007 11:04:13 -0400
  • Delivery-date: Tue, 14 Aug 2007 08:04:54 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Mark Williamson wrote:
> Sounds a bit weird. How many CPUs in this box? My memories of the benchmarks suggest this should be better than you're seeing, but maybe your workload is tickling some bad cases or something...

Hi Mark:

We have a quad-core processor and 8GB of RAM. I gave the domU all 4 VCPUs (I didn't know you could do that! -- you can't do the same with memory, since dom0 needs its own share). I assigned 7.5GB of RAM to the domU.

This was using CentOS-5 x86_64 and MySQL 5.1.20-beta.
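
For reference, the CPU/memory side of the domU config boils down to a couple of lines like the following (the file name is illustrative, not our exact path):

  # /etc/xen/vmsql02.cfg (illustrative)
  vcpus  = 4       # hand all four cores to the guest
  memory = 7680    # in MB, i.e. 7.5GB; the remainder stays with dom0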


> I know this sounds really weird, but when comparing to native performance you do need to test on the same area of the disk; have you done this? Portions of the disk nearer to the outside edge of the platter can have significantly higher transfer rates because they move at a higher linear velocity.

No, we didn't specifically test the same area of the disk. We have a hardware RAID-10 (PERC 5 on Dell PE 2950) using 128K chunks. But I think we did enough tests to see a pattern emerging.
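
If anyone wants to rule that geometry effect in or out, a rough way to hit the same region of the array from dom0 and from the guest is something like the following (device names are examples only, and the offsets only line up if the domU is handed the very same device you read in dom0):

  # drop the page cache, then read ~1GB starting 10GB into the device
  sync; echo 3 > /proc/sys/vm/drop_caches
  dd if=/dev/sda  of=/dev/null bs=1M skip=10240 count=1024   # from dom0
  dd if=/dev/xvda of=/dev/null bs=1M skip=10240 count=1024   # from the domU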

Granted, we also used sql-bench, and the results for the domU were actually okay, but we believe our own SQL tests focus more specifically on disk I/O. I also ran unixbench since that's a bit more standardized, and it seems to show some of the same results. I don't know how helpful it is, but I've enclosed those results below.
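
(For what it's worth, both runs below are just the stock suite, kicked off roughly like this from the unixbench-4.1.0 source tree; exact paths may differ on your system:

  make
  ./Run    # no arguments = the default test selection
)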


> XenEnterprise and friends add optimised disk drivers for Windows, but if you're running paravirtualised Linux then you've already got optimised disk drivers, so the commercial product probably won't help.

Ok thanks for clarifying.

We haven't yet tested Xen 3.1 -- do you know if that includes any I/O performance enhancements?

johnn


Here are the unixbench results for CentOS-5 x86_64 on a physical machine (non-Xen):

======================================================

  BYTE UNIX Benchmarks (Version 4.1.0)
System -- Linux sql1 2.6.18-8.1.8.el5 #1 SMP Tue Jul 10 06:39:17 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
  Start Benchmark Run: Wed Aug  8 18:15:57 EDT 2007
   1 interactive users.
   18:15:57 up 9 min,  1 user,  load average: 0.01, 0.04, 0.00
  lrwxrwxrwx 1 root root 4 Aug  8 16:41 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/sda1              8123168   1189948   6513928  16% /
Dhrystone 2 using register variables      8893810.7 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   1781.2 MWIPS (9.9 secs, 10 samples)
System Call Overhead                       473046.7 lps   (10.0 secs, 10 samples)
Pipe Throughput                            492008.6 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching                96405.6 lps   (10.0 secs, 10 samples)
Process Creation                             8338.6 lps   (30.0 secs, 3 samples)
Execl Throughput                             2372.1 lps   (29.8 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks      844117.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks     478062.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks      283041.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks        235142.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks       126786.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks         80142.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks     1755004.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks     931432.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks      601062.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                 5410.3 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                 1721.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                 925.7 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            1243892.9 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              1265520.4 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)              326020.1 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)             925118.2 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)            512452.7 lps   (10.0 secs, 3 samples)
Arithoh                                 227641190.3 lps   (10.0 secs, 3 samples)
C Compiler Throughput                         978.7 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places            84417.7 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi              81500.4 lps   (20.0 secs, 3 samples)


                     INDEX VALUES
TEST                                      BASELINE      RESULT    INDEX

Dhrystone 2 using register variables      116700.0   8893810.7    762.1
Double-Precision Whetstone                    55.0      1781.2     323.9
Execl Throughput                              43.0      2372.1     551.7
File Copy 1024 bufsize 2000 maxblocks       3960.0    283041.0     714.8
File Copy 256 bufsize 500 maxblocks         1655.0     80142.0     484.2
File Copy 4096 bufsize 8000 maxblocks       5800.0    601062.0    1036.3
Pipe Throughput                            12440.0    492008.6     395.5
Process Creation                             126.0      8338.6     661.8
Shell Scripts (8 concurrent)                   6.0      1721.7    2869.5
System Call Overhead                       15000.0    473046.7     315.4

      =========
FINAL SCORE 640.2

======================================================

And here are the results for the exact same setup in a domU on the same physical hardware:

  BYTE UNIX Benchmarks (Version 4.1.0)
System -- Linux store-nyc377-vmsql02.limewire.com 2.6.18-8.1.8.el5xen #1 SMP Tue Jul 10 07:06:45 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
  Start Benchmark Run: Thu Aug  9 14:34:36 EDT 2007
   1 interactive users.
   14:34:36 up 1 day, 20:41,  1 user,  load average: 0.16, 0.04, 0.01
  lrwxrwxrwx 1 root root 4 Jul 17 17:54 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/xvda1             6092360   3795932   1981960  66% /
Dhrystone 2 using register variables      8931957.1 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   1788.3 MWIPS (9.9 secs, 10 samples)
System Call Overhead                       174628.7 lps   (10.0 secs, 10 samples)
Pipe Throughput                            211195.6 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching                54065.7 lps   (10.0 secs, 10 samples)
Process Creation                             2336.0 lps   (30.0 secs, 3 samples)
Execl Throughput                              948.3 lps   (29.7 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks      404623.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks     254555.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks      145873.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks        105386.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks        65650.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks         39343.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks     1103909.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks     645148.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks      398281.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                 2693.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                  638.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                 335.3 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            1244421.3 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              1267033.4 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)              326565.1 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)             926358.0 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)            513037.0 lps   (10.0 secs, 3 samples)
Arithoh                                 227902041.2 lps   (10.0 secs, 3 samples)
C Compiler Throughput                         761.3 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places            37142.6 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi              81742.8 lps   (20.0 secs, 3 samples)


                     INDEX VALUES
TEST                                      BASELINE      RESULT    INDEX

Dhrystone 2 using register variables      116700.0   8931957.1    765.4
Double-Precision Whetstone                    55.0      1788.3     325.1
Execl Throughput                              43.0       948.3     220.5
File Copy 1024 bufsize 2000 maxblocks       3960.0    145873.0     368.4
File Copy 256 bufsize 500 maxblocks         1655.0     39343.0     237.7
File Copy 4096 bufsize 8000 maxblocks       5800.0    398281.0     686.7
Pipe Throughput                            12440.0    211195.6     169.8
Process Creation                             126.0      2336.0     185.4
Shell Scripts (8 concurrent)                   6.0       638.0    1063.3
System Call Overhead                       15000.0    174628.7     116.4

      =========
FINAL SCORE 324.3

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

