
[Xen-devel] [Xen Hackathon] VirtIO and Xen - minutes


Below are the minutes taken by Konrad (thanks!) during the
VirtIO and Xen session held at the Xen Hackathon.
I made some minor fixes/cleanups. The session was led by me.


Introduction - Daniel is part of the OASIS group that works on the VirtIO
specification, which is based on the specification that was prepared by Rusty.

Daniel would like to talk about this as there is no solution yet
for some of the problems with using VirtIO under Xen.

Wei Liu - in HVM it worked out of the box. QEMU has the backend and it works.
QEMU does the mapping of guest memory (aka mapcache). When there is a fault, it
maps the page into a bucket. At some point, when the bucket is full, it will
unmap.
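
As an illustration of the map-on-fault behaviour described above, here is a
minimal toy model in Python. It is not QEMU's mapcache code: the class name,
the bucket size, and the choice to evict the least recently used bucket are
all assumptions made for the sketch. The minutes only say that unmapping
happens at some point once things fill up.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for the toy model


class ToyMapCache:
    """Toy model of a bucketed map cache. Not QEMU's actual code."""

    def __init__(self, bucket_pages, max_buckets):
        self.bucket_pages = bucket_pages  # guest pages covered by one bucket
        self.max_buckets = max_buckets    # how many buckets may stay mapped
        self.buckets = OrderedDict()      # bucket index -> pretend mapping
        self.map_calls = 0
        self.unmap_calls = 0

    def _map_bucket(self, index):
        # Stand-in for asking the hypervisor to map a range of guest pages.
        self.map_calls += 1
        return bytearray(self.bucket_pages * PAGE_SIZE)

    def access(self, pfn):
        """Return the mapping that covers guest page `pfn`, mapping on a miss."""
        index = pfn // self.bucket_pages
        if index in self.buckets:
            # Hit: the bucket is already mapped, just mark it as recently used.
            self.buckets.move_to_end(index)
            return self.buckets[index]
        if len(self.buckets) >= self.max_buckets:
            # Cache full: drop the least recently used bucket (assumed policy).
            self.buckets.popitem(last=False)
            self.unmap_calls += 1
        self.buckets[index] = self._map_bucket(index)
        return self.buckets[index]
```

Two accesses that fall in the same bucket cost a single map operation, and a
new bucket only forces an unmap once `max_buckets` are already in use.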

The committee decided to drop all PV support (was there any in the spec?) -
not even lguest support remains.

Four questions:
 - What bus do we want to use to access VirtIO devices - XenBus or PCI?
 - What do we want for DMAs in the ring - PFNs or something else (grant
   references)? Grant references will fix the issue of QEMU having to have
   full access to the guest.
 - One idea is to pass in an index to a pool (instead of PFNs), which maps
   to the persistent grants.
   Doing grant copy does eat a lot of CPU cycles, and doing persistent grants
   seems to do the job.
 - Which devices do we want to support first?
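
The pool-index idea from the list above can be sketched as a toy model in
Python. Everything here is hypothetical (the `PersistentPool` class, the
`(index, length)` descriptor layout, the function names) and it is not a
proposal for the ring format. It only shows the principle: the pages are
shared once, the frontend copies data into them, and the ring carries an
index into the pool rather than a PFN.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class PersistentPool:
    """Pages granted once at setup and kept mapped by the backend (toy model)."""

    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.free = list(range(num_pages))

    def alloc(self):
        if not self.free:
            raise RuntimeError("pool exhausted")
        return self.free.pop()

    def release(self, index):
        self.free.append(index)


def frontend_submit(pool, ring, payload):
    """Copy the payload into a pool page and post (index, length) on the ring."""
    if len(payload) > PAGE_SIZE:
        raise ValueError("payload larger than one page")
    index = pool.alloc()
    pool.pages[index][:len(payload)] = payload  # the memcpy into the shared page
    ring.append((index, len(payload)))
    return index


def backend_consume(pool, ring):
    """Read one request. The backend only ever touches pages in the pool."""
    index, length = ring.pop(0)
    return index, bytes(pool.pages[index][:length])
```

In this model the frontend calls `pool.release(index)` once the backend has
reported completion. The backend never needs access to guest memory outside
the pool, which is the security property the grant-based approach is after.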

Do we want to even do it? One use case is V2V.
Another is VirtIO devices that don't exist in Xen (virtio_rng, virtio_serial);
we could take advantage of these. Potentially also a virtio_gpu (it is being
worked on).

There is no problem adding extra buses - we can add XenBus.
It is a considerable amount of work. We could create a fake PCI bus.
Does the spec define whether commands have to be sync or async?

The configuration space is only used for setup - the balloon driver
uses it as well to change the target.

Rusty said he does not object to adding any bus. Coming up with a new
bus v2 is hard - if we do something sensible, it would be accepted.

Do we want to reuse the backends instead of reimplementing them? We want to
reuse them. We can reuse QEMU and just use the backend code without the
PCI trapping and such. If we want to have QEMU expose only one PCI device,
we could do that as well.

We would have to reengineer on both sides - QEMU and frontend.

1) VirtIO MMIO could be used - s390 uses it.
   In PVH we could use ioreq to transfer just the page to another domain.
   We would get the synchronous behavior with a bit of work.

1a) XenBus could be used - the objection was that it is hard
   to maintain. However, in practice it has not
   changed in years.

2) QEMU backends would have to be split up - and there would be two (or more)
   configuration fronts that would configure them.

The VirtIO ring carries PFNs, so the VirtIO backend has full access to the
guest. We don't want that from a security standpoint (the backend seeing the
whole guest). We could translate the grant references to something that
the guest and host can understand. The DMA API in the Linux kernel could do
the translation (PFN -> grant references).
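
To make the PFN -> grant reference translation concrete, here is a small toy
model in Python. It does not use the real Linux DMA API or the Xen grant
table interface. `ToyGrantTable` and `ToyDmaOps` are hypothetical names, and
the only point is that the driver keeps calling a map/unmap pair while the
handle placed on the ring becomes a grant reference instead of a PFN.

```python
class ToyGrantTable:
    """Toy grant table: hands out references for individual guest pages."""

    def __init__(self):
        self.next_ref = 0
        self.entries = {}  # grant reference -> PFN the backend may access

    def grant(self, pfn):
        ref = self.next_ref
        self.next_ref += 1
        self.entries[ref] = pfn
        return ref

    def revoke(self, ref):
        del self.entries[ref]


class ToyDmaOps:
    """Hypothetical DMA-style layer that hides the translation from the driver."""

    def __init__(self, table):
        self.table = table

    def map_page(self, pfn):
        # The driver passes a PFN and gets back the handle to put on the ring.
        return self.table.grant(pfn)

    def unmap_page(self, handle):
        # Called once the backend is done; the page is no longer accessible.
        self.table.revoke(handle)
```

With this shape, only pages that are currently mapped for I/O appear in the
table, so the backend can be restricted to exactly those pages.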

If we want to support other OSes, say Windows, we would have a problem
with the DMA API (which would still have to be added). We could do:
 1) Grant mapping of individual pages
 2) Or bounce buffering to a persistent page.

If we have something in the spec, we need something generic that other
OSes can use. So we need to prepare something common.

Batched grant maps - some of the grant issues (TLB flushing, etc.) can be
fixed this way.

It boils down to memcpy pages vs VMEXIT (batched grant map or batched grant

VMEXIT latency lower than 4KB pages.

Doing grant mapping / grant unmapping is unlikely to be faster than memcpy.

We seem to be convinced that memcpy/persistent grants are
faster and better.

Without the vhost acceleration it won't be fast.
Without vhost, the datapath is virtio -> qemu -> tap. With vhost, it bypasses
QEMU.

vhost could be modified to use this pool of memory (map it) and pluck
the bytes from it as it needs.

Which drivers do we want to support? If you have that transfer layer, everything

Start with virtio-console.

Would it make sense to do a mock-up (prototype)? We would want to prototype
before we go with the specification.

Daniel can do it, but it would have to be deferred until after the EFI work.
Could we chop up the pieces for GSoC or for OPW in the winter?

The work-items would be:
 1) Chop up the pieces into nice small chunks.
 2) Prototype work on said pieces (OPW and GSoC interns, Citrix interns,
    Oracle interns, community).
 3) Once all done, write a draft for the OASIS VirtIO group.


I am going to coordinate all work according to the above findings.

