
Re: [Xen-devel] Re: Improving hvm IO performance by using self IO emulator (YA io-emu?)

To: Mark Williamson <mark.williamson@xxxxxxxxxxxx>
From: Anthony Liguori <aliguori@xxxxxxxxxx>
Date: Thu, 22 Feb 2007 15:33:21 -0600
Cc: tgingold@xxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 22 Feb 2007 13:33:01 -0800
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Mark Williamson wrote:

The big problem with disk emulation isn't IO latency, but the fact that
the IDE emulation can only have one outstanding request at a time.  The
SCSI emulation helps this a lot.
IIRC, a real IDE can only have one outstanding request too (this may have
changed with AHCI).  This is really IIRC :-(
Can SATA drives queue multiple outstanding requests?  Thought some newer
rev could, but I may well be misremembering - in any case we'd want
something that was well supported.


SATA can, yes.  However, as you mention, SATA is very poorly supported.

The LSI SCSI adapter seems to work quite nicely with Windows and Linux.
And it supports TCQ.  And it's already implemented :-)  Can't really
beat that :-)
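
To illustrate why the number of outstanding requests matters more than
raw latency here, a minimal sketch of an idealized model follows. It
assumes every request takes the same fixed service time and that a
device with queue depth N overlaps N requests perfectly; the function
name and the 1000 us figure are hypothetical, not measurements from
this thread.

```python
def max_requests_per_second(latency_us, depth):
    """Idealized upper bound on throughput with `depth` requests in flight.

    Assumes a fixed per-request latency and perfect overlap between
    queued requests, which real devices and emulators will not achieve.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return depth * 1_000_000 / latency_us


if __name__ == "__main__":
    latency_us = 1000  # hypothetical per-request latency
    # One outstanding request (the IDE emulation case) versus a queued
    # device (for example SCSI with TCQ); the depth of 8 is arbitrary.
    for depth in (1, 8):
        bound = max_requests_per_second(latency_us, depth)
        print(f"depth={depth}: at most {bound:.0f} requests/s")
```

Under these assumptions the bound scales linearly with queue depth, so
a single-request device stays capped even if per-request latency is
unchanged.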

I don't know what the bottleneck is in network emulation, but I suspect
the number of copies we have in the path has a great deal to do with it.
This reason seems obvious.
Latency may matter more to the network performance than it did to block,
actually (especially given our current setup is fairly pessimal wrt
latency!).  It would be interesting to see how much difference this makes.
In any case, copies are bad too :-)  Presumably, hooking directly into the
paravirt network channel would improve this situation too.
Perhaps the network device ought to be the first to move?


Can't say.  I haven't done much research on network performance.

There's a lot to like about this sort of approach.  It's not a silver
bullet wrt performance but I think the model is elegant in many ways.
An interesting place to start would be lapic/pit emulation.  Removing
this code from the hypervisor would be pretty useful and there is no
need to address PV-on-HVM issues.
Indeed this is the simpler code to move.  But why would it be useful?
It might be a good proof of concept, and it simplifies the hypervisor (and
the migration / suspend process) at the same time.
Does the firmware get loaded as an option ROM or is it a special portion
of guest memory that isn't normally reachable?
IMHO it should come with hvmload.  No need to make it unreachable.
Mmmm.  It's not like the guest can break security if it tampers with the
device models in its own memory space.
Question: how does this compare with using a "stub domain" to run the
device models?  The previously proposed approach was to automatically
switch to the stub domain on trapping an IO by the HVM guest, and have
that stub domain run the device models, etc.

Reflecting is a bit more expensive than doing a stub domain.  There is
no way to wire up the VMEXITs to go directly into the guest so you're
always going to have to pay the cost of going from guest => host =>
guest => host => guest for every PIO.  The guest is incapable of
reenabling PG on its own hence the extra host => guest transition.

Compare to stub domain where, if done correctly, you can go from guest
=> host/0 => host/3 => host/0 => guest.  The question would be, is
host/0 => host/3 => host/0 fundamentally faster than host => guest =>
host?

I know that guest => host => guest typically costs *at least* 1000 nsecs
on SVM.  A null sysenter syscall (that's host/3 => host/0 => host/3) is
roughly 75 nsecs.

So my expectation is that stub domain can actually be made to be faster
than reflecting.
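
A back-of-the-envelope sketch of that arithmetic, using only the two
figures quoted above. It rests on an assumption that is exactly the
open question: that host/0 => host/3 => host/0 costs about as much as
a null syscall. It also ignores address space switches, TLB effects
and the device model's own work, so treat the results as rough lower
bounds, not measurements. The function names are made up for
illustration.

```python
# Figures quoted in this message.
GUEST_HOST_GUEST_NS = 1000  # guest => host => guest, "at least", on SVM
NULL_SYSCALL_NS = 75        # host/3 => host/0 => host/3 via sysenter


def reflecting_cost_ns(round_trip_ns=GUEST_HOST_GUEST_NS):
    """guest => host => guest => host => guest: two full round trips."""
    return 2 * round_trip_ns


def stub_domain_cost_ns(round_trip_ns=GUEST_HOST_GUEST_NS,
                        ring_switch_ns=NULL_SYSCALL_NS):
    """guest => host/0 => host/3 => host/0 => guest.

    One guest round trip plus one ring 0 => ring 3 => ring 0 excursion,
    ASSUMED here to cost about the same as a null syscall.
    """
    return round_trip_ns + ring_switch_ns


if __name__ == "__main__":
    reflect = reflecting_cost_ns()
    stub = stub_domain_cost_ns()
    print(f"reflecting : >= {reflect} ns per PIO")
    print(f"stub domain: ~  {stub} ns per PIO (under the assumption above)")
    print(f"ratio      : {reflect / stub:.2f}x")
```

If the ring switch turns out to be much more expensive than a syscall
(for example because of a page table switch), the gap narrows; the
break-even point is where it costs as much as a full guest round trip.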


Regards,

Anthony Liguori

You seem to be actually proposing running the code within the HVM guest
itself.  The two approaches aren't actually that different, IMO, since the
guest still effectively has two different execution contexts.  It does seem
to me that running within the HVM guest itself might be more flexible.
A cool little trick that this strategy could enable is to run a full Qemu
instruction emulator within the device model - I'd imagine this could be
useful on IA64, for instance, in order to provide support for running legacy
OSes (e.g. for x86, or *cough* PPC ;-))
Cheers,
Mark



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
