
Re: [Xen-devel] [PATCH v4 14/14] xen/x86: setup PVHv2 Dom0 ACPI tables



On Thu, Dec 22, 2016 at 01:19:36PM -0500, Boris Ostrovsky wrote:
> On 12/22/2016 11:44 AM, Roger Pau Monne wrote:
> > On Thu, Dec 22, 2016 at 09:24:02AM -0700, Jan Beulich wrote:
> >>>>> On 22.12.16 at 17:17, <boris.ostrovsky@xxxxxxxxxx> wrote:
> >>> On 12/22/2016 07:17 AM, Roger Pau Monne wrote:
> >>>> Maybe Boris has some ideas about how to do CPU hotplug for Dom0?
> >>> Why would Xen need to be able to parse the AML that is intended to be
> >>> executed by dom0? I'd think that all the hypervisor would need to do is
> >>> to load it into memory, not any different from how it's done for regular
> >>> guests.
> >> Well, if Dom0 executed the unmodified _MAT, it would get back
> >> data relating to the physical CPU. Roger is overriding MADT to
> >> present virtual CPU information to Dom0, and this would likewise
> >> need to happen for the _MAT return value.
> 
> By "unmodified _MAT" you mean system's _MAT?  Why can't we provide our
> own that will return _MAT object properly adjusted for dom0? We are
> going to provide our own (i.e. not system's) DSDT, aren't we?

Providing Dom0 with a different DSDT is almost impossible from a Xen PoV. For
one, Xen cannot parse the original DSDT (because it's a dynamic table), and
then if we were to provide a modified DSDT, we would also need an ASL
compiler, so that we could parse the DSDT, modify it, and then compile it
again in order to provide it to Dom0.

Although all this code could be put under an init section that would be freed
after Dom0 creation it seems overkill and very far from trivial, not to mention
that I'm not even sure what side-effects there would be if Xen parsed the DSDT
itself without having any drivers.

> > This is one of the problems with this Dom0/Xen split brain problem that we
> > have, and cannot get away from.
> >
> > To clarify a little bit, I'm not overwriting the original MADT in memory, so
> > Dom0 should still be able to access it if the implementation of _MAT returns
> > data from that area. AFAICT when plugging in a physical CPU (pCPU) into the
> > hardware, Dom0 should see "correct" data returned by the _MAT method. However
> > (as represented by the " I've used), this data will not match Dom0 vCPU
> > topology, and should not be used by Dom0 (only reported to Xen in order to
> > bring up the new pCPU).
> 
> So the problem seems to be that we need to run both _MATs --- system's
> and dom0's.

Exactly, we need _MAT for pCPUs and _MAT for vCPUs.

> > Then the problem arises because we have no way to perform vCPU hotplug for
> > Dom0, not at least as it is done for DomU (using ACPI), so we would have to use
> > an out-of-band method in order to do vCPU hotplug for Dom0, which is a PITA.
> 
> 
> I would very much like to avoid this.

Maybe we can provide an extra SSDT for Dom0 that basically overwrites the CPU
objects (_PR.CPUX), but I'm not sure whether ACPI allows overwriting objects
like this.

After reading the spec, I came across the following note in the SSDT section:

"Additional tables can only add data; they cannot overwrite data from previous
tables."

So I guess this is a no-go.

I only see the following options:

 - Prevent Dom0 from using the original _MAT methods (or even the full _PR.CPU
   objects) using the STAO, and then provide Dom0 with an out-of-band method
   (ie: not ACPI) in order to do CPU hotplug.

 - Expand the STAO so that it can be used to override ACPI namespace objects,
   possibly by adding a payload field that contains AML code. It seems that
   Linux already supports overwriting part of the ACPI namespace from
   user-space[0], so part of the needed machinery seems to be already in place
   (hopefully in ACPICA code?).

 - Disable the native CPU objects in the DSDT/SSDT using the STAO. Then pick up
   unused ACPI CPU IDs and use those for vCPUs. Provide an additional SSDT that
   contains ACPI objects for those vCPUs (as is done for DomU). This means we
   would probably have to start using x2APIC entries in the MADT, since the
   CPU IDs might easily expand past 255 (AFAICT we could still keep the APIC
   IDs low however, since those two are disjoint).
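
To illustrate the MADT point in the last option, here is a minimal sketch in C.
The two struct layouts follow the Processor Local APIC (type 0) and Processor
Local x2APIC (type 9) entries from the ACPI spec: the former only has an 8-bit
ACPI processor ID, the latter a 32-bit one. Everything else (the struct and
macro names, and the madt_write_vcpu() helper) is hypothetical and is not taken
from the patch series. Whether an OS accepts type 9 entries that carry low APIC
IDs is not something this sketch addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_TYPE_LOCAL_APIC   0
#define MADT_TYPE_LOCAL_X2APIC 9
#define MADT_FLAG_ENABLED      1u

/* Processor Local APIC entry: ACPI ID and APIC ID are 8 bits each. */
struct __attribute__((packed)) madt_local_apic {
    uint8_t  type;
    uint8_t  length;
    uint8_t  acpi_proc_id;
    uint8_t  apic_id;
    uint32_t flags;
};

/* Processor Local x2APIC entry: both IDs are 32 bits. */
struct __attribute__((packed)) madt_local_x2apic {
    uint8_t  type;
    uint8_t  length;
    uint16_t reserved;
    uint32_t x2apic_id;
    uint32_t flags;
    uint32_t acpi_proc_uid;
};

_Static_assert(sizeof(struct madt_local_apic) == 8, "type 0 is 8 bytes");
_Static_assert(sizeof(struct madt_local_x2apic) == 16, "type 9 is 16 bytes");

/*
 * Hypothetical helper: write one MADT processor entry for a Dom0 vCPU into
 * buf (which must hold at least 16 bytes) and return the number of bytes
 * written. The x2APIC form is used as soon as either ID needs more than
 * 8 bits, e.g. when unused ACPI CPU IDs above 255 are picked for vCPUs.
 */
size_t madt_write_vcpu(void *buf, uint32_t acpi_id, uint32_t apic_id)
{
    assert(buf != NULL);

    if (acpi_id <= UINT8_MAX && apic_id <= UINT8_MAX) {
        struct madt_local_apic e = {
            .type = MADT_TYPE_LOCAL_APIC,
            .length = sizeof(struct madt_local_apic),
            .acpi_proc_id = (uint8_t)acpi_id,
            .apic_id = (uint8_t)apic_id,
            .flags = MADT_FLAG_ENABLED,
        };
        memcpy(buf, &e, sizeof(e));
        return sizeof(e);
    }

    struct madt_local_x2apic e = {
        .type = MADT_TYPE_LOCAL_X2APIC,
        .length = sizeof(struct madt_local_x2apic),
        .reserved = 0,
        .x2apic_id = apic_id,
        .flags = MADT_FLAG_ENABLED,
        .acpi_proc_uid = acpi_id,
    };
    memcpy(buf, &e, sizeof(e));
    return sizeof(e);
}
```

With this, a vCPU with ACPI ID 3 and APIC ID 6 gets an 8-byte type 0 entry,
while a vCPU with ACPI ID 300 and the same APIC ID gets a 16-byte type 9 entry.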

I don't really fancy any of these options. The last one seems like the best
IMHO, but I would like to hear some feedback about them, and of course I'm open
to suggestions :).

Roger.

[0] https://www.kernel.org/doc/Documentation/acpi/method-customizing.txt

