
Re: [Xen-devel] QEMU XenServer/XenProject Working group meeting 29th September 2016



> On 18 Oct 2016, at 20:54, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> 
> I think this kind of call should be announced on xen-devel before they
> happen, to give a chance to other people to participate (I cannot
> promise I would have participated but it is the principle that counts).
> 
> If I missed the announcement, I apologize.

Stefano, the meeting started off as an internal meeting to brainstorm and share
experiences and challenges we have with QEMU amongst different Citrix teams,
with a view to getting a wider dialog started. Maybe we are at the stage where
it makes sense to open it up.

> On Fri, 14 Oct 2016, Jennifer Herbert wrote:
>> XenStore
>> --------
>> 
>> For the non-pv part of QEMU, XenStore is only used in two places.
>> There is the DM state, and the physmap mechanism.  Although there is a
>> vague plan for replacing the physmap mechanism, it is some way off.
>> 
>> The DM state key is used for knowing when the qemu process is running,
>> etcetera.  QMP would seem to be an option to replace it; however, there
>> is no (nice) way to wait on a socket until it has been opened.  One
>> solution might be to use Xenstore to let you know the QMP sockets
>> were available, before QEMU drops privileges, and then QMP could be
>> used to know QEMU is in the running state.
>> 
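
As a rough illustration of the QMP half of that idea, here is a minimal
Python sketch. It assumes the QMP channel is already open (for example, once
Xenstore has signalled that the socket is available) and is handed over as a
pair of text file objects. The function names are made up for this example;
the greeting, the qmp_capabilities handshake and the query-status command are
the standard QMP ones.

```python
import json


def _read_reply(rfile):
    """Read one command reply, skipping any asynchronous QMP events."""
    while True:
        line = rfile.readline()
        if not line:
            raise EOFError("QMP channel closed")
        msg = json.loads(line)
        if "event" in msg:
            continue  # events can arrive between a command and its reply
        if "error" in msg:
            raise RuntimeError(msg["error"].get("desc", "QMP error"))
        return msg


def qmp_is_running(rfile, wfile):
    """Return True if QEMU reports that the guest is running."""
    greeting = json.loads(rfile.readline())
    if "QMP" not in greeting:
        raise RuntimeError("unexpected QMP greeting")
    reply = None
    # The capabilities handshake must come before any other command.
    for command in ("qmp_capabilities", "query-status"):
        wfile.write(json.dumps({"execute": command}) + "\n")
        wfile.flush()
        reply = _read_reply(rfile)
    return bool(reply["return"].get("running"))
```

With a connected UNIX socket, the two file objects could come from
sock.makefile("r") and sock.makefile("w"). This does not solve the problem of
waiting for the socket to appear; it only covers the step after that.
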
>> To avoid the need to use xs-restrict, you would need to both replace
>> physmap and rework qemu startup procedure. The use of xs-restrict would
>> be more expedient, and does not look to need that much work.
>> 
>> Discussion was had over how secure it would be to allow a guest access
>> to these Xenstore keys - it was concluded that a guest could mostly
>> only mess itself up.  If a guest attempted to prevent itself from being
>> migrated, the toolstack could time it out and kill it.
>> 
>> There followed a discussion on the Xenbus protocol, and additions
>> needed.  The aim is merely to restrict the permission for the command
>> to that of the guest whose domID you provide.  It was proposed that
>> it uses the header as is, with its 16 bytes, with the command
>> 'one-time-restrict', and then the payload would have two additional
>> fields at the start.  These two fields would correspond to the domid to
>> restrict as, and the real command. Transaction ID and tags would be
>> taken from the real header.
>> 
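
To make the proposed framing concrete, here is a minimal Python sketch of how
such a wrapped message might be packed and unpacked. The 16-byte header of
four 32-bit fields (type, request ID, transaction ID, length) is the existing
wire header. Everything else is an assumption for illustration: the command
number for 'one-time-restrict', the width of the two new payload fields, the
little-endian byte order, and the function names.

```python
import struct

# Existing 16-byte header: type, req_id, tx_id, len (four 32-bit fields).
# Little-endian byte order is an assumption made for this sketch.
HEADER = struct.Struct("<IIII")
# Hypothetical payload prefix: domid to restrict as, then the real command.
# Two 32-bit fields is an assumption; the minutes do not give the widths.
PREFIX = struct.Struct("<II")

XS_ONE_TIME_RESTRICT = 0x7F  # placeholder, not an allocated command number
XS_READ = 2                  # illustrative value for the wrapped command


def wrap_restricted(domid, real_cmd, req_id, tx_id, payload):
    """Wrap a request so it would run with the permissions of `domid`."""
    body = PREFIX.pack(domid, real_cmd) + payload
    # Request ID and transaction ID stay in the real header.
    return HEADER.pack(XS_ONE_TIME_RESTRICT, req_id, tx_id, len(body)) + body


def unwrap_restricted(msg):
    """Return (domid, real_cmd, req_id, tx_id, payload) from a wrapped message."""
    cmd, req_id, tx_id, length = HEADER.unpack_from(msg, 0)
    if cmd != XS_ONE_TIME_RESTRICT:
        raise ValueError("not a one-time-restrict message")
    body = msg[HEADER.size:HEADER.size + length]
    if len(body) != length or length < PREFIX.size:
        raise ValueError("truncated message")
    domid, real_cmd = PREFIX.unpack_from(body, 0)
    return domid, real_cmd, req_id, tx_id, body[PREFIX.size:]
```

For example, wrap_restricted(5, XS_READ, 1, 0, b"/local/domain/5/name\0")
would produce a read request to be evaluated with domain 5's permissions.
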
>> Although inter domain xs-restrict is not specifically needed for this
>> project, it is thought it might be a blocking item for upstream
>> acceptance.  It is thought these changes would not require that much
>> work to implement, and may be useful in other use cases. Only a few
>> changes to QEMU would be needed, and libxl should be able to track
>> QEMU versions.  Ian Jackson volunteered to look at this, with David
>> helping with the kernel bits.  Ian won't have time to look at this
>> until after Xen 4.8 is released.
>> 
>> There was discussion about what may fail once privileges are taken away,
>> which would include CDs and PCI pass-through.  It is thought the full
>> list can only be known by trying.  Not everything needs to work for
>> acceptance upstream, such as PCI pass-through.  If such an
>> incompatible feature is needed, restrictions can be turned off.  These
>> problems can be fixed in a later phase, with CDs likely being at the
>> top of the list.
> 
> One thing to note is that xs-restrict is unimplemented in cxenstored.
> 
> 
>> Disaggregation
>> ==============
>> 
>> A disaggregation proposal which had previously been posted to a QEMU
>> forum was discussed.  It was not previously accepted by all. The big
>> question was how to separate the device models from the machine, with
>> a particular point of contention being around PIIX and the idea of
>> starting a QEMU instance without one.
> 
> Right. In particular I tend to agree with the other QEMU maintainers
> when they say: why ask for a PIIX3 compatible machine, when actually you
> don't want to be PIIX3 compatible?
> 
> 
>> The general desire from us is
>> we want to have a specific device emulated and nothing else.
> 
> This is really not possible with QEMU, because QEMU is a machine
> emulator, not a device emulator. BTW who wants this? I mean, why is this
> part of the QEMU depriv discussion? It is not necessary. I think what we
> want for QEMU depriv is to be able to build a QEMU PV machine with just
> the PV backends in it, which is attainable with the current
> architecture. I know there are use cases for having an emulator of just
> one device, but I don't think they should be confused with the more
> important underlying issue here, which is QEMU running with full
> privileges.
> 
> 
>> It is
>> suggested you would have a software interface between each device that
>> looked like a software version of PCI.  The PIIX device could be attached
>> to the CPU via this pseudo-PCI interface.  This would fit in well with how
>> the IOREQ server and IOMMU work.  Although this sounds like a large
>> architectural change is wanted, it's suggested that actually it's just
>> that we're asking them to take a different stability and plug-ability
>> posture on the interfaces they already have.
>> 
>> This architectural issue is the cause behind lots of little
>> annoyances, which have been going on for years. Xen is having to make
>> up lots of strange stuff to keep QEMU happy, and there is confusion
>> over memory ownership.  Fixing the architecture should make our lives
>> much easier.  These architectural issues are also making things
>> difficult for Intel, who are trying to work around the issue with Xen
>> changes, which may just worsen the problem.  This means this is
>> effectively blocking them.
>> 
>> It is proposed that instead of having a QEMU binary, what is really
>> wanted is a QEMU library.  With a library you could easily take the
>> bits needed, create your own main loop and link them to whatever
>> interface, IOREQ services or IPC mechanism is needed. There would
>> no longer be a need for the IOREQ server to be in QEMU, which it is
>> thought should be an attractive idea for the QEMU maintainers.  It is
>> also thought that other projects, such as the clear containers people,
>> would also benefit from such an architecture.  The idea of splitting
>> out the CPU code from the device code may even be attractive to KVM.
> 
> The idea of having a QEMU library has always been resisted upstream. It
> takes the project in a very different direction. As QEMU maintainer I
> don't know if such a thing would actually be good for the QEMU
> community.

We revisited the original disaggregation thread (Wei originally proposed the
patches). What we proposed at the time was a sort of half-way house that was
very Xen specific and not really of much use to any QEMU downstream other than
Xen. Even then, opinions amongst QEMU maintainers were divided: some were in
favour, some were not. We would definitely need to make a good case, do some
convincing upfront, address the concerns of the QEMU community and work with
the QEMU maintainers from the get-go. As you rightly point out, such an
approach does change some of the fundamental assumptions within QEMU, and we
wouldn't want to do this if there are no benefits to QEMU. I think it is
worthwhile trying this again. You may have some further insights, which would
be quite valuable.

Regards
Lars

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel