
Re: [Xen-devel] Re: Improving hvm IO performance by using self IO emulator (YA io-emu?)

Mark Williamson wrote:
The big problem with disk emulation isn't IO latency, but the fact that
the IDE emulation can only have one outstanding request at a time.  The
SCSI emulation helps this a lot.
IIRC, a real IDE can only have one outstanding request too (this may have
changed with AHCI).  This is really IIRC :-(

Can SATA drives queue multiple outstanding requests? Thought some newer rev could, but I may well be misremembering - in any case we'd want something that was well supported.

SATA can, yes.  However, as you mention, SATA is very poorly supported.

The LSI SCSI adapter seems to work quite nicely with Windows and Linux. And it supports TCQ. And it's already implemented :-) Can't really beat that :-)
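To make the "one outstanding request" point concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not a measurement: the function name `max_iops`, the 100 us service time and the backend parallelism of 8 are all assumptions picked for illustration. The only idea taken from the thread is that the emulated IDE path is capped at one request in flight, while SCSI with TCQ can keep several queued.

```python
def max_iops(service_time_us: float, queue_depth: int, backend_parallelism: int) -> float:
    """Upper bound on requests/second when each request takes service_time_us
    and at most min(queue_depth, backend_parallelism) requests overlap.
    Toy model: ignores emulation overhead, seek patterns and merging."""
    in_flight = min(queue_depth, backend_parallelism)
    return in_flight * 1_000_000 / service_time_us


# Hypothetical numbers, for illustration only.
SERVICE_TIME_US = 100.0   # assumed time to complete one request
BACKEND_PARALLELISM = 8   # assumed number of requests the backend can overlap

ide_like = max_iops(SERVICE_TIME_US, queue_depth=1, backend_parallelism=BACKEND_PARALLELISM)
tcq_like = max_iops(SERVICE_TIME_US, queue_depth=8, backend_parallelism=BACKEND_PARALLELISM)

print(f"one outstanding request: {ide_like:.0f} req/s")
print(f"queue depth 8 (TCQ-like): {tcq_like:.0f} req/s")
```

In this model the per-request latency is the same in both cases; the difference comes purely from how many requests the emulated controller lets the guest keep in flight.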

I don't know what the bottleneck is in network emulation, but I suspect
the number of copies we have in the path has a great deal to do with it.
This reason seems obvious.

Latency may matter more to the network performance than it did to block, actually (especially given our current setup is fairly pessimal wrt latency!). It would be interesting to see how much difference this makes.

In any case, copies are bad too :-) Presumably, hooking directly into the paravirt network channel would improve this situation too.

Perhaps the network device ought to be the first to move?

Can't say.  I haven't done much research on network performance.

There's a lot to like about this sort of approach.  It's not a silver
bullet wrt performance but I think the model is elegant in many ways.
An interesting place to start would be lapic/pit emulation.  Removing
this code from the hypervisor would be pretty useful and there is no
need to address PV-on-HVM issues.
Indeed, this is the simpler code to move.  But why would it be useful?

It might be a good proof of concept, and it simplifies the hypervisor (and the migration / suspend process) at the same time.

Does the firmware get loaded as an option ROM or is it a special portion
of guest memory that isn't normally reachable?
IMHO it should come with hvmload.  No need to make it unreachable.

Mmmm. It's not like the guest can break security if it tampers with the device models in its own memory space.

Question: how does this compare with using a "stub domain" to run the device models? The previous proposed approach was to automatically switch to the stub domain on trapping an IO by the HVM guest, and have that stub domain run the device models, etc.

Reflecting is a bit more expensive than doing a stub domain. There is no way to wire up the VMEXITs to go directly into the guest so you're always going to have to pay the cost of going from guest => host => guest => host => guest for every PIO. The guest is incapable of reenabling PG on its own hence the extra host => guest transition.

Compare to stub domain where, if done correctly, you can go from guest => host/0 => host/3 => host/0 => guest. The question would be, is host/0 => host/3 => host/0 fundamentally faster than host => guest => host.

I know that guest => host => guest typically costs *at least* 1000 nsecs on SVM. A null sysenter syscall (that's host/3 => host/0 => host/3) is roughly 75 nsecs.

So my expectation is that stub domain can actually be made to be faster than reflecting.
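The arithmetic behind that expectation can be written out as a small sketch. It uses only the two figures quoted above (at least 1000 nsecs for guest => host => guest on SVM, roughly 75 nsecs for a null sysenter syscall). Treating the null syscall cost as a stand-in for host/0 => host/3 => host/0 is an assumption, and probably an optimistic one, since a real switch into a stub domain may do more work than a null syscall. The constant and function names are made up for this example.

```python
# Figures quoted in this mail; both are rough.
VMEXIT_ROUND_TRIP_NS = 1000  # guest => host => guest on SVM ("at least")
RING_ROUND_TRIP_NS = 75      # null sysenter syscall, host/3 => host/0 => host/3


def reflect_cost_ns() -> int:
    """guest => host => guest => host => guest: two VMEXIT round trips."""
    return 2 * VMEXIT_ROUND_TRIP_NS


def stub_domain_cost_ns() -> int:
    """guest => host/0 => host/3 => host/0 => guest: one VMEXIT round trip
    plus one ring transition pair (assumed to cost about a null syscall)."""
    return VMEXIT_ROUND_TRIP_NS + RING_ROUND_TRIP_NS


print(f"reflecting into the guest: >= {reflect_cost_ns()} ns per PIO")
print(f"stub domain (optimistic):  ~  {stub_domain_cost_ns()} ns per PIO")
```

With these numbers the reflecting path costs at least 2000 nsecs per PIO against roughly 1075 nsecs for the stub domain, so the stub domain stays ahead as long as its ring transition costs less than a second VMEXIT round trip.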


Anthony Liguori

You seem to be actually proposing running the code within the HVM guest itself. The two approaches aren't actually that different, IMO, since the guest still effectively has two different execution contexts. It does seem to me that running within the HVM guest itself might be more flexible.

A cool little trick that this strategy could enable is to run a full Qemu instruction emulator within the device model - I'd imagine this could be useful on IA64, for instance, in order to provide support for running legacy OSes (e.g. for x86, or *cough* PPC ;-))


Xen-devel mailing list


