
RE: [Xen-users] poor harddisk performance HVM domain (Windows 2003 guest)


  • To: "Joost van den Broek" <joost@xxxxxxxxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx
  • From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
  • Date: Mon, 8 May 2006 11:26:36 +0200
  • Delivery-date: Mon, 08 May 2006 02:26:54 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>
  • Thread-index: AcZx2/9jUI7C9i1XRl2LqJjQhPKX4gAnx0AA
  • Thread-topic: [Xen-users] poor harddisk performance HVM domain (Windows 2003 guest)

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> Joost van den Broek
> Sent: 07 May 2006 14:38
> To: xen-users@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-users] poor harddisk performance HVM domain 
> (Windows 2003 guest)
> 
> Hi,
> 
> I'd like to see some confirmation on this one, since I've been 
> experimenting with this for days and have been unable to get 
> acceptable transfer speeds. I thought such poor performance 
> should not happen with VT? 
> 
> It gets even worse when installing and using an Ubuntu HVM 
> guest: DMA can't be enabled for the QEMU hard disk, resulting 
> in a very slow ~3.5MB/s read. Isn't there any way to resolve this?
> 
>  - Joost
> 
As discussed before, the QEMU device model is pretty primitive. But
beyond that, the HVM solution (or any other virtualized solution) has
a fundamental addressing problem. Giving the virtual machine direct
access to the real hardware is, in 90% of cases with hard disks, not
an option - you'd need as many HD controllers as DomUs (not just
disks, but actual physical PCI controller devices) - and even that
assumes the device operates in non-DMA mode. Otherwise we'd also need
an IOMMU to translate the guest OS's "physical" addresses into "real
physical" (machine) addresses - we fake the memory map to the GUEST
OS so that it believes memory starts at 0 and goes to, say, 256MB,
when it REALLY starts at 256MB and goes to 512MB, for example. So a
DMA operation where the OS says "send bytes at address 123456 to the
hard disk" would have to be translated to "send bytes at address
256MB + 123456 to the hard disk" (the sketch after the list below
shows exactly this translation). There are three ways to solve this:
1. IOMMU - this works without any modification of the guest OS.
2. Para-virtualized OS - the base OS is modified so that when it's
asking for a physical address, it knows that there are two kinds of
physical address: guest physical and "real" physical (machine
physical).
3. Para-virtualized driver - although the main OS isn't modified, the
guest's driver for the hard disk is modified to understand the
double layering of physical addresses.
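
To make the address problem concrete, here's a minimal C sketch of
the translation that has to happen somewhere - in an IOMMU, in the
modified OS, or in a PV driver - before a DMA address from the guest
can be handed to real hardware. The names and the single-contiguous-
chunk layout are invented for illustration, matching the 256MB
example above; real implementations translate per page:

    #include <stdint.h>

    #define GUEST_RAM_BASE 0x10000000ULL  /* guest RAM really starts at 256MB */
    #define GUEST_RAM_SIZE 0x10000000ULL  /* ...and is 256MB long             */

    /* Translate a guest "physical" address to a machine physical one. */
    static uint64_t guest_to_machine(uint64_t guest_phys)
    {
        if (guest_phys >= GUEST_RAM_SIZE)
            return (uint64_t)-1;             /* outside guest RAM: fault */
        return GUEST_RAM_BASE + guest_phys;  /* 123456 -> 256MB + 123456 */
    }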

Option one is the best choice of these - but it still relies on each
guest having its own hardware device (controller + disk) [or a
multi-port device that allows several guests to access it, where the
controller presents a separate "device" for each guest - I've heard
that there are network controllers that can do this at the moment]. 

Option two is fine for Linux (that's how a Linux DomU built for Xen
works), but for OSes where the source code isn't easily accessible,
this doesn't quite work...

Option three is the only remaining viable option. 

Some further background: whilst the actual access to the hard disk is
done pretty well by Dom0 - at least it SHOULD be, or it needs fixing
in Dom0 - the overhead of running the disk access through the two
layers of hypervisor (HVM code) and QEMU does add some noticeable
delays. Bear in mind that each sector being accessed in IDE will
require writes to the following registers:
Sector count
Sector number
Cylinder LSB
Cylinder MSB
Drive/Head
Command
[These CAN be combined, but many drivers write at least most of them
individually.]
For a read, the driver will then wait for an interrupt; for a write,
it will perform further IO reads to detect the drive being ready to
accept the data (QEMU is usually ready after the first poll),
followed by a 512-byte OUTS operation. 
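
In driver terms, the per-sector dance looks roughly like this - a
sketch for the legacy primary IDE channel at I/O ports 0x1F0-0x1F7,
with the outb/inb/insw port helpers and wait_for_irq assumed rather
than shown, and error handling omitted:

    #include <stdint.h>

    extern void    outb(uint16_t port, uint8_t val);          /* port output      */
    extern uint8_t inb(uint16_t port);                        /* port input       */
    extern void    insw(uint16_t port, void *buf, int count); /* 16-bit string in */
    extern void    wait_for_irq(void);   /* hypothetical: sleep until disk IRQ */

    void ide_read_sector(uint32_t lba, uint16_t *buf)
    {
        outb(0x1F2, 1);                           /* Sector count          */
        outb(0x1F3, lba & 0xFF);                  /* Sector number         */
        outb(0x1F4, (lba >> 8) & 0xFF);           /* Cylinder LSB          */
        outb(0x1F5, (lba >> 16) & 0xFF);          /* Cylinder MSB          */
        outb(0x1F6, 0xE0 | ((lba >> 24) & 0xF));  /* Drive/Head (LBA mode) */
        outb(0x1F7, 0x20);                        /* Command: READ SECTORS */

        wait_for_irq();                 /* sleep until the drive interrupts */
        (void)inb(0x1F7);               /* read Status - another intercept  */
        insw(0x1F0, buf, 256);          /* 512-byte INS from the data port  */
    }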

Each one of these accesses causes an intercept, which means that the
guest is halted, the hypervisor is invoked, and the cause of the
intercept is checked and acted upon. In the case of IO operations,
this means sending a message to QEMU, a context switch in the
hypervisor to Dom0 (where QEMU runs), and QEMU interpreting the
message and performing the operation (just storing the information
about what we want to access on the hard disk, until we get to the
Command register write). 
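
Conceptually (this is NOT the actual Xen/QEMU code - the structures
and function names here are made up to show the shape of it), each
intercepted port access flows like this:

    #include <stdint.h>

    struct ioreq { uint16_t port; uint32_t data; int is_write; };
    struct ide_state { uint8_t nsector, sector, lcyl, hcyl, select; };

    extern void send_to_qemu(struct ioreq *req);  /* message to Dom0     */
    extern void schedule_dom0(void);              /* full context switch */
    extern void ide_exec_cmd(struct ide_state *s, uint8_t cmd);

    /* Hypervisor side: runs on every intercepted port access. */
    void handle_io_intercept(struct ioreq *req)
    {
        send_to_qemu(req);    /* one message...                */
        schedule_dom0();      /* ...plus a DomU -> Dom0 switch */
    }

    /* QEMU side, in Dom0: latch taskfile writes; only act on Command. */
    void ide_ioport_write(struct ide_state *s, uint16_t port, uint8_t val)
    {
        switch (port) {
        case 0x1F2: s->nsector = val; break;   /* just stored away... */
        case 0x1F3: s->sector  = val; break;
        case 0x1F4: s->lcyl    = val; break;
        case 0x1F5: s->hcyl    = val; break;
        case 0x1F6: s->select  = val; break;
        case 0x1F7: ide_exec_cmd(s, val);      /* ...until the Command write:
                                                  now do the real read/write */
                    break;
        }
    }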

Once the Command register has been written, QEMU performs the actual
disk read/write (in the form of an fread/fwrite on the image file or
physical disk). Dom0 may well "sleep" (allow another context to run)
whilst it's waiting for the actual data to arrive from/to the disk.
[And here, DMA may well be used.]

Once the data transfer is complete, an interrupt is issued to the
guest and it's restarted. The guest then reads the status register
(another intercept) to see if it all worked out OK, and then, if it
was a read, performs a 512-byte INS operation. 

There's a lot of passing information around, and a lot of state
saving/restoring when context switching. Just a simple intercept with
no "real work" in the hypervisor will take, probably, a couple of
thousand cycles to perform - and we have somewhere between 4 and 8 of
those for each disk operation. We also have full context switches
between DomU and Dom0 to run QEMU. 
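
As a rough worked example using those figures: 4-8 intercepts at
~2,000 cycles each is already 8,000-16,000 cycles per 512-byte
sector, and the DomU -> Dom0 -> DomU round trip for QEMU adds its own
register and state save/restore on top - all before a single byte has
actually moved to or from the disk.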

A para-virtualized driver could do the same work in perhaps one set
of context switches and no more than two intercepts - so the task
would be done with much less overhead. 
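
The trick is batching: the whole request goes into shared memory in
one go. A simplified sketch (modelled loosely on Xen's block
interface, with abbreviated, made-up field and function names):

    #include <stdint.h>

    struct blk_request {
        uint8_t  operation;      /* read or write                    */
        uint8_t  nr_segments;    /* how many data pages follow       */
        uint64_t sector_number;  /* start sector on the virtual disk */
        uint64_t id;             /* echoed back in the response      */
        /* ...per-segment page references go here...                 */
    };

    extern struct ring *shared_ring;   /* page shared with Dom0      */
    extern void ring_put(struct ring *r, struct blk_request *req);
    extern void notify_backend(void);  /* ONE event/intercept        */

    void submit_request(struct blk_request *req)
    {
        ring_put(shared_ring, req);  /* sector, count and MACHINE addresses
                                        of the data pages, all in one go   */
        notify_backend();            /* single notification; Dom0's backend
                                        picks up the whole batch           */
    }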

--
Mats


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

