Xen project Mailing List

[Xen-devel] [PATCH] Quick path for PIO instructions which cut more than half of the expense

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

From: "Xiang, Kai" <kai.xiang@xxxxxxxxx>

Date: Mon, 22 Dec 2008 18:16:14 +0800

Accept-language: en-US

Delivery-date: Mon, 22 Dec 2008 02:16:42 -0800

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thread-index: AclkHlHyULdzldIRRfyHPzvdmtg6NA==

Thread-topic: [PATCH] Quick path for PIO instructions which cut more than half of the expense

Hi all,

Happy Holidays to you :)

We found that the PIO instruction path changed in the Xen 3.3 tree compared to the earlier Xen 3.1 tree. We suspect this puts more burden on Xen itself, which hurts performance. This patch (c/s: 18933) addresses the issue by giving a short path for non-string PIO.

To demonstrate how much performance difference this makes, we have the experiments and data below:

1) Direct TSC data from xentrace
We use a small piece of code, running in a RHEL5 guest, that reads a port repeatedly, and we collect the xentrace data at the same time. The TSC from VMEXIT to blocked_to_runnable (which can be viewed as one indicator of the cost of the code path that handles the VMEXIT) is cut by ~59% (from 2616 to 1064).

2) Port IO TSC observed from this piece of code
This includes the response from the QEMU side. Here we still see the TSC reduced by about 18% for one simple PIO (from 16112 to 13296).

3) The influence on more realistic workloads
We tested on a Windows 2003 Server guest, using IOmeter to run a disk-bound test. The IO pattern is "Default", which uses 67% random read and 33% random write with a 2K request size. To reduce the influence of the file cache, I ran it 3 times (1 minute each) from the start of the computer (both Xen and the guest).

Compare before and after:

         IO per second (3 runs)      | average response time (3 runs)
----------------------------------------------------------------
Before:  100.004; 109.447; 110.801   | 9.988; 9.133; 9.022
After:   101.951; 110.893; 114.179   | 9.806; 9.016; 8.756
----------------------------------------------------------------

So we get a 1%~3% IO performance gain while reducing the average response time by 2%~3% at the same time. Considering this is just an ordinary SATA disk and an IO-bound workload, we expect more with faster disks and more cached IO.

BTW: I also fixed one wrong comment in the patch.

Looking forward to your feedback. Thanks in advance.

Best wishes
Kai

--------------------------------------------------------------------
Backups:
The test code is also included in the attachments. The configuration is as below:

Intel(r) Supermicro Tylersburg-EP Server System
CPU Info: 2x Quad-core processor 2.8GHz with 8MB L3 Cache (Nehalem)
Disk: Seagate SATA 500G
Memory Info: 12GB memory (12 x 1GB DDR3 1066MHz)

Guest configurations:
Memory: 512MB
VCPU: 1
Device model: Stub domain

IOmeter test status:
Two virtual disks used: hda for the system and hdb as the test hard disk for IOmeter.
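
The percentages quoted in the message can be rechecked from the raw numbers it reports. The snippet below is only a minimal sketch for that arithmetic: it is not the attached pio.c, the helper names (percent_reduction, percent_gain, mean) are made up for illustration, and the inputs are the figures copied from the message.

```python
# Recheck the relative changes reported in the message.
# All helper names here are hypothetical; the numbers come from the email.

def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

def percent_gain(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

# 1) TSC from VMEXIT to blocked_to_runnable (xentrace)
vmexit_before, vmexit_after = 2616, 1064
# 2) TSC for one simple PIO as seen from the guest test code
pio_before, pio_after = 16112, 13296
# 3) IOmeter results, three runs each
iops_before = [100.004, 109.447, 110.801]
iops_after = [101.951, 110.893, 114.179]
resp_before = [9.988, 9.133, 9.022]
resp_after = [9.806, 9.016, 8.756]

print("VMEXIT path TSC reduction: %.1f%%"
      % percent_reduction(vmexit_before, vmexit_after))
print("Guest-visible PIO TSC reduction: %.1f%%"
      % percent_reduction(pio_before, pio_after))

# Per-run comparison, pairing run i before with run i after.
for b, a in zip(iops_before, iops_after):
    print("IOPS gain for run: %.2f%%" % percent_gain(b, a))
for b, a in zip(resp_before, resp_after):
    print("Response time reduction for run: %.2f%%" % percent_reduction(b, a))

# Comparison of the three-run averages.
print("Mean IOPS gain: %.2f%%"
      % percent_gain(mean(iops_before), mean(iops_after)))
print("Mean response time reduction: %.2f%%"
      % percent_reduction(mean(resp_before), mean(resp_after)))
```

Pairing run i before with run i after is an assumption; the message does not say the runs correspond one to one, so the comparison of the three-run averages is probably the safer number to look at.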

Attachment: pio_quickpath.patch
Description: pio_quickpath.patch

Attachment: pio.c
Description: pio.c

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
