
Re: [Xen-devel] Re: Improving hvm IO performance by using self IO emulator (YA io-emu?)



While I'm thinking about it, I wonder how returning to the guest from the 
emulator would work...

We'd want to hypercall to transfer back to it...  do we need specific Xen 
support for this or could (for instance) Gerd's work on domU kexec be 
leveraged here?

Perhaps it would be worth evaluating some kind of "send these events and then 
switch back to guest code" hypercall so that the emulator doesn't have to 
bounce in and out of Xen so much.  It remains to be seen whether this makes much 
difference to overall performance, but it seems somehow civilised ;-)
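
Just to make the idea concrete, here's roughly what I have in mind -- purely a 
sketch, none of these names or structures exist in Xen today:

    /* Hypothetical "deliver a batch of events, then resume the guest"
     * hypercall.  The emulator hands back all completed work in one go
     * instead of one notification per event plus a separate resume. */
    #include <stdint.h>

    #define HVMOP_send_and_resume  0x100      /* made-up sub-op number */

    struct hvm_emul_event {                   /* made-up structure */
        uint16_t type;                        /* e.g. PIO_DONE, MMIO_DONE, IRQ */
        uint16_t vcpu;
        uint64_t data;                        /* completion value / vector */
    };

    struct hvm_send_and_resume {              /* made-up structure */
        uint32_t nr_events;
        struct hvm_emul_event events[8];      /* small inline batch */
    };

    /* The emulator fills in the batch and issues a single hypercall; Xen
     * injects the events and re-enters guest code directly, so there is
     * only one hypervisor round trip per batch of completions. */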

Cheers,
Mark

On Thursday 22 February 2007 21:23, Anthony Liguori wrote:
> tgingold@xxxxxxx wrote:
> >> KVM has a pretty much optimal path from the kernel to userspace.  The
> >> overhead of going to userspace is roughly two syscalls (and we've
> >> measured this overhead).  Yet it makes almost no difference in IO
> >> throughput.
> >
> > The path can be split into 2 parts: from trap to ioemu and from ioemu to
> > real hardware (the return is the same).  ioemu to hardware should be
> > roughly the same with KVM and Xen.  Is trap to ioemu that different
> > between Xen and KVM ?
>
> Yup.  With KVM, there is no scheduler involvement.  qemu does a blocking
> ioctl to the Linux kernel, and the Linux kernel does a vmrun.  Provided
> the time slice hasn't been exhausted, Linux returns directly to qemu
> after a vmexit.
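
For reference, the userspace side of the path Anthony describes looks roughly 
like this -- simplified from what qemu does with the KVM interface; 
handle_pio_read()/handle_pio_write() are placeholder emulator hooks:

    /* Simplified sketch of the qemu/KVM dispatch loop: block in KVM_RUN,
     * decode the exit, emulate, and go straight back in. */
    #include <stdint.h>
    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    extern void handle_pio_read(uint16_t port, void *data, int size, int count);
    extern void handle_pio_write(uint16_t port, void *data, int size, int count);

    void vcpu_loop(int vcpu_fd, struct kvm_run *run)  /* run = mmap'ed kvm_run */
    {
        for (;;) {
            ioctl(vcpu_fd, KVM_RUN, 0);               /* blocks until vmexit */

            switch (run->exit_reason) {
            case KVM_EXIT_IO: {
                /* The data lives inside the kvm_run page itself. */
                void *data = (char *)run + run->io.data_offset;
                if (run->io.direction == KVM_EXIT_IO_OUT)
                    handle_pio_write(run->io.port, data,
                                     run->io.size, run->io.count);
                else
                    handle_pio_read(run->io.port, data,
                                    run->io.size, run->io.count);
                break;                                /* loop straight back */
            }
            default:
                break;                                /* MMIO etc. elided */
            }
        }
    }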
>
> Xen uses event channels, which involve domain switches and
> select()'ing.  A lot of the time the path is pretty optimal.  However,
> quite a bit of the time you run into worst-case scenarios with the
> various schedulers and the latency skyrockets.
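
And for contrast, the dom0 qemu-dm side is roughly the loop below.  The helpers 
and the trimmed-down ioreq layout are placeholders standing in for the real 
/dev/xen/evtchn plus shared-iopage plumbing; the point is the extra 
select()/scheduler hop:

    #include <stdint.h>
    #include <sys/select.h>

    typedef struct {                        /* heavily simplified ioreq */
        uint64_t addr, data;
        uint8_t  dir, state;
    } ioreq_t;
    #define STATE_IORESP_READY 3            /* as in Xen's public ioreq.h */

    extern int      evtchn_fd;                      /* placeholder */
    extern ioreq_t *shared_ioreq;                   /* placeholder */
    extern int  next_pending_port(int fd);          /* placeholder */
    extern void notify_port(int fd, int port);      /* placeholder */
    extern void handle_ioreq(ioreq_t *req);         /* the device model */

    void dm_loop(void)
    {
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(evtchn_fd, &rfds);

            /* qemu-dm sleeps here; waking it again needs dom0's scheduler
             * to pick it, which is where the worst-case latency comes from. */
            select(evtchn_fd + 1, &rfds, NULL, NULL, NULL);

            int port = next_pending_port(evtchn_fd);
            handle_ioreq(shared_ioreq);
            shared_ioreq->state = STATE_IORESP_READY;
            notify_port(evtchn_fd, port);   /* kick Xen so the vcpu can resume */
        }
    }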
>
> > Honestly I don't know.  Does anyone have figures?
>
> Yeah, it varies a lot on different hardware.  For reference:
>
> If a round trip to a null int80 syscall is 150 nsec, a round trip vmexit
> to userspace in KVM may be 2500 nsec.  On bare metal it may cost 1700
> nsec to do a PIO operation to an IDE port, so 2500 really isn't that bad.
>
> Xen is usually in that range too, but every so often it spikes to
> something awful (hundreds of thousands of nsec), which skews the average cost.
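
The syscall baseline above is easy to reproduce on a given box (the vmexit side 
needs hypervisor instrumentation, but this gives you the ~150 nsec reference 
point):

    /* Rough measurement of a null syscall round trip, for comparison with
     * the numbers above.  Averages over many iterations to hide timer cost. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        enum { N = 1000000 };
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++)
            syscall(SYS_getpid);            /* bypasses any glibc caching */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("null syscall: ~%.0f nsec per round trip\n", ns / N);
        return 0;
    }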
>
> > It would be interesting to compare disk (or net) performance between:
> > * linux
> > * dom0
> > * driver domain
> > * PV-on-HVM drivers
> > * ioemu
> >
> > Does such a comparison exist?
>
> Not that I know of.  I've done a lot of benchmarking but not of PV-on-HVM.
>
> Xen can typically get pretty close to native for disk IO.
>
> >> The big problem with disk emulation isn't IO latency, but the fact that
> >> the IDE emulation can only have one outstanding request at a time.  The
> >> SCSI emulation helps this a lot.
> >
> > IIRC, a real IDE can only have one outstanding request too (this may have
> > changed with AHCI).  This is really IIRC :-(
>
> You recall correctly.  IDE can only have one outstanding DMA request
> at a time.
>
> > BTW, on ia64 there is no REP IN/OUT.  When Windows uses IDE in PIO mode
> > (during install and crash dump), performance is horrible.  There is a
> > patch which adds special handling for PIO mode and really improves the
> > data rate.
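
Something like the following is the general shape of that kind of batching -- 
an illustrative sketch only, not the actual patch; ioemu_read_sector() is a 
placeholder for the expensive trip to the full emulator:

    /* Serve repeated accesses to the IDE data port from a locally cached
     * sector, so only one trip to the full emulator is needed per 512 bytes. */
    #include <stdint.h>
    #include <string.h>

    #define IDE_DATA_PORT   0x1f0
    #define SECTOR_SIZE     512

    static uint8_t  sector_buf[SECTOR_SIZE];
    static unsigned buf_pos = SECTOR_SIZE;      /* buffer starts empty */

    extern void ioemu_read_sector(uint16_t port, void *buf, size_t len);

    uint16_t pio_read16(uint16_t port)
    {
        if (port != IDE_DATA_PORT)
            return 0;   /* other ports would take the normal slow path (not shown) */

        if (buf_pos >= SECTOR_SIZE) {           /* refill: one expensive trip */
            ioemu_read_sector(port, sector_buf, SECTOR_SIZE);
            buf_pos = 0;
        }
        uint16_t val;
        memcpy(&val, &sector_buf[buf_pos], sizeof(val));
        buf_pos += sizeof(val);
        return val;                             /* served locally, no exit */
    }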
>
> Ouch :-(  Fortunately, OS's won't use PIO very often.
>
> >> I don't know what the bottleneck is in network emulation, but I suspect
> >> the number of copies we have in the path has a great deal to do with it.
> >
> > This reason seems obvious.
> >
> >
> > [...]
> >
> >> There's a lot to like about this sort of approach.  It's not a silver
> >> bullet wrt performance but I think the model is elegant in many ways.
> >> An interesting place to start would be lapic/pit emulation.  Removing
> >> this code from the hypervisor would be pretty useful and there is no
> >> need to address PV-on-HVM issues.
> >
> > Indeed, this is the simplest code to move.  But why would it be useful?
>
> Removing code from the hypervisor reduces the TCB, so it's a win.  Having
> it in firmware within the HVM domain is even better, TCB-wise, than
> having it in dom0.
>
> >> Can you provide more details on how the reflecting works?  Have you
> >> measured the cost of reflection?  Do you just setup a page table that
> >> maps physical memory 1-1 and then reenter the guest?
> >
> > Yes: disable PG, set up flat mode, and re-enter the guest.
> > The cost has not been measured yet!
>
> That would be very useful to measure.  My chief concern would be that
> disabling PG would be considerably more costly than entering with paging
> enabled.  That may not be the case on VT today, since there are no ASIDs,
> so it would be useful to test on SVM too.
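
A crude way to get that number from inside the guest-side code on x86 would be 
to bracket a known-trapping access with rdtsc -- a sketch, assuming a stable 
TSC and an I/O port that is known to bounce through the emulator:

    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    static inline void outb(uint16_t port, uint8_t val)
    {
        __asm__ __volatile__("outb %0, %1" : : "a"(val), "Nd"(port));
    }

    /* Average cycles for one trap-to-emulator-and-back round trip. */
    uint64_t measure_reflection(uint16_t trap_port)
    {
        enum { N = 1000 };
        uint64_t t0 = rdtsc();
        for (int i = 0; i < N; i++)
            outb(trap_port, 0);         /* any access that hits the emulator */
        uint64_t t1 = rdtsc();
        return (t1 - t0) / N;
    }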
>
> >> Does the firmware get loaded as an option ROM or is it a special portion
> >> of guest memory that isn't normally reachable?
> >
> > IMHO it should come with hvmloader.  No need to make it unreachable.
>
> It would be nice to get rid of hvmloader in the long term IMHO.  Any
> initialization should be done in the BIOS.
>
> Regards,
>
> Anthony Liguori
>
> > Tristan.
>

-- 
Dave: Just a question. What use is a unicyle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel