
Re: [Xen-devel] domU using linux-2.6.37-xen-next pvops kernel with CONFIG_PARAVIRT_SPINLOCKS disabled results in 150% performance improvement (updated)



On Mon, Dec 20, 2010 at 05:03:13PM -0800, Dante Cinco wrote:
> (Sorry, I accidentally sent the previous post before finishing the summary
> table)
> 
> For a couple of months now, we've been trying to track down the slow I/O
> performance in pvops domU. Our system has 16 Fibre Channel devices, all
> PCI-passthrough to domU. We were previously using a 2.6.32 (Ubuntu version)
> HVM kernel and were getting 511k IOPS. We switched to pvops with Konrad's
> xen-pcifront-0.8.2 kernel and were disappointed to see the performance
> degrade to 11k IOPS. After disabling some kernel debug options, including
> KMEMLEAK, the performance jumped to 186k IOPS, but that was still well
> below what we were getting with the HVM kernel. We tried disabling spinlock
> debugging in the kernel, but it actually resulted in a drop in performance
> to 70k IOPS.
> 
> Last week we switched to linux-2.6.37-xen-next and with the same kernel
> debug options disabled, the I/O performance was slightly better at 211k
> IOPS. We tried disabling spinlock debugging again and saw a similar drop in
> performance to 58k IOPS. We searched around for any performance-related
> posts regarding pvops and found two references to CONFIG_PARAVIRT_SPINLOCKS
> (one from Jeremy and one from Konrad):
> http://lists.xensource.com/archives/html/xen-devel/2009-05/msg00660.html
> http://lists.xensource.com/archives/html/xen-devel/2010-11/msg01111.html
> 
> Both posts recommended enabling PARAVIRT_SPINLOCKS when running under Xen
> (Konrad strongly so). Since it's enabled by default, we decided to see what
> would happen if we disabled CONFIG_PARAVIRT_SPINLOCKS. With spinlock
> debugging enabled we were getting 205k IOPS, but with spinlock debugging
> disabled the performance leaped to 522k IOPS!!!
> 
> I'm assuming that this behavior is unexpected.

<scratches his head> You got me. I am really happy to find out that you guys
were able to solve this conundrum.

Are the guests contending for the CPUs (say you have 4 logical CPUs and you
launch two guests, each wanting 4 vCPUs)? How many vCPUs do the guests have?
Are the guests pinned to the CPUs? Which scheduler is the hypervisor using?
credit1?
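
For background on what that option toggles: with CONFIG_PARAVIRT_SPINLOCKS=y
every spinlock acquire/release goes through the pv_lock_ops function-pointer
table, so under Xen a vCPU that cannot get a lock can be put to sleep in the
hypervisor instead of burning its timeslice spinning. Below is a rough
userspace sketch of the idea (only an illustration of the indirection and the
spin-then-yield slow path; the names, the 1024-iteration bound and the use of
sched_yield() are stand-ins, this is not the kernel code):

/* Userspace analogue of the pv_lock_ops indirection -- NOT the kernel code. */
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

struct mylock { atomic_flag taken; };

/* "Native" path: roughly what you get with PARAVIRT_SPINLOCKS=n --
 * a pure busy-wait with no hypervisor involvement. */
static void native_lock(struct mylock *l)
{
        while (atomic_flag_test_and_set_explicit(&l->taken, memory_order_acquire))
                ; /* spin, burning the vCPU's timeslice */
}

/* "Paravirt" path: spin for a bounded number of iterations, then give the
 * physical CPU back.  The real Xen code blocks the vCPU on a per-CPU
 * "lock kicker" event channel instead of calling sched_yield(). */
static void pv_lock(struct mylock *l)
{
        for (;;) {
                for (int i = 0; i < 1024; i++)
                        if (!atomic_flag_test_and_set_explicit(&l->taken,
                                                               memory_order_acquire))
                                return;
                sched_yield();
        }
}

static void my_unlock(struct mylock *l)
{
        atomic_flag_clear_explicit(&l->taken, memory_order_release);
}

/* With PARAVIRT_SPINLOCKS=y every lock/unlock goes through pointers like
 * these, which costs something on the fast path even when the slow path
 * is never taken. */
static void (*lock_op)(struct mylock *) = pv_lock;
static void (*unlock_op)(struct mylock *) = my_unlock;

int main(void)
{
        struct mylock l = { ATOMIC_FLAG_INIT };

        native_lock(&l);        /* PARAVIRT_SPINLOCKS=n flavour */
        my_unlock(&l);

        lock_op(&l);            /* PARAVIRT_SPINLOCKS=y flavour */
        puts("acquired through the pv_lock_ops-style indirection");
        unlock_op(&l);
        return 0;
}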
> 
> Here's a summary of the kernels, config changes and performance (in IOPS):
> 
>                              pcifront 0.8.2    linux 2.6.37-xen-next
>                              (pvops)           (pvops)
> 
> Spinlock debugging enabled,
> PARAVIRT_SPINLOCKS=y              186k              205k
> 
> Spinlock debugging disabled,
> PARAVIRT_SPINLOCKS=y               70k               58k
> 
> Spinlock debugging disabled,
> PARAVIRT_SPINLOCKS=n              247k              522k

Whoa.... Thank you for the table. My first thought was: "whoa, PV byte-locking
spinlocks sure suck", but then I realized that there are some improvements in
2.6.37-xen-next, like in the vmap flushing code ...
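
The byte lock in question is conceptually just a test-and-set byte plus a
count of vCPUs that have gone to sleep waiting for it, so the unlock path only
has to kick the hypervisor when somebody is actually blocked. A from-memory
sketch of that shape, with a POSIX semaphore standing in for the event-channel
poll/kick (the xen_like_* and byte_* names are made up; this is not the code
in arch/x86/xen/spinlock.c):

/* "Byte lock with a sleeper count", in the spirit of the old Xen PV
 * spinlock.  A semaphore stands in for the event-channel poll/kick. */
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

struct xen_like_spinlock {
        atomic_uchar lock;      /* 0 = free, 1 = held; no FIFO fairness */
        atomic_uint  spinners;  /* waiters asleep (or about to go to sleep) */
        sem_t        kick;      /* stand-in for the per-CPU lock kicker IRQ */
};

static void byte_lock_init(struct xen_like_spinlock *l)
{
        atomic_init(&l->lock, 0);
        atomic_init(&l->spinners, 0);
        sem_init(&l->kick, 0, 0);
}

static bool byte_trylock(struct xen_like_spinlock *l)
{
        unsigned char expected = 0;
        return atomic_compare_exchange_strong(&l->lock, &expected, 1);
}

static void byte_lock(struct xen_like_spinlock *l)
{
        while (!byte_trylock(l)) {
                atomic_fetch_add(&l->spinners, 1);
                if (byte_trylock(l)) {  /* recheck to avoid a lost wakeup */
                        atomic_fetch_sub(&l->spinners, 1);
                        return;
                }
                sem_wait(&l->kick);     /* real code: xen_poll_irq() */
                atomic_fetch_sub(&l->spinners, 1);
        }
}

static void byte_unlock(struct xen_like_spinlock *l)
{
        atomic_store(&l->lock, 0);
        /* only pay for the kick when somebody is actually waiting */
        if (atomic_load(&l->spinners) > 0)
                sem_post(&l->kick);
}

int main(void)
{
        struct xen_like_spinlock l;

        byte_lock_init(&l);
        byte_lock(&l);
        byte_unlock(&l);
        return 0;
}

A spurious kick just sends a waiter around the loop once more, which is also
how the real thing tolerates being woken for a lock it then loses.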


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

