
Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

On 07/22/2015 10:09 AM, Juergen Gross wrote:
On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
On 07/22/2015 09:50 AM, Juergen Gross wrote:
On 07/22/2015 03:36 PM, Dario Faggioli wrote:
On Tue, 2015-07-21 at 16:00 -0400, Boris Ostrovsky wrote:
On 07/20/2015 10:43 AM, Boris Ostrovsky wrote:
On 07/20/2015 10:09 AM, Dario Faggioli wrote:

I'll need to see how LLC IDs are calculated, probably also from some
CPUID bits.

No, can't do this: the LLC id is calculated from CPUID leaf 4 (on Intel), which uses indexes in the ECX register, and the xl syntax doesn't allow you to override CPUIDs for such indexed leaves.
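For reference, the LLC id the guest ends up with is derived from the APIC id and the cache-sharing count reported in the leaf-4 sub-leaf for the last cache level (num_threads_sharing = EAX[25:14] + 1). A minimal sketch of that Linux-style computation follows; the function names are mine, not the kernel's:

```c
/* Sketch of the Linux-style LLC id derivation: shift the APIC id
 * right by ceil(log2(number of threads sharing the last level cache)).
 * threads_sharing would come from EAX[25:14] + 1 of the matching
 * CPUID leaf 4 sub-leaf. */
static unsigned int index_msb(unsigned int n)
{
    /* ceil(log2(n)) for n >= 1 */
    unsigned int msb = 0;

    for (n -= 1; n != 0; n >>= 1)
        msb++;
    return msb;
}

unsigned int llc_id(unsigned int apicid, unsigned int threads_sharing)
{
    return apicid >> index_msb(threads_sharing);
}
```

The point being: any CPUID-based fix would have to keep leaf 4's sharing counts consistent with the APIC ids the guest sees, which xl can't express today.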

Right. Which leaves us with the question of what we should do and/or
recommend users do.

If there were a workaround that we could put in place, and document
somewhere, however tricky it was, I'd say to go for it, and call it
acceptable for now.

But, if there isn't, should we disable PV vNUMA, or warn the user that
he may see issues? Can we identify, in Xen or in the toolstack, whether a
host topology will be problematic, and disable/warn in those cases too?

I'm not sure, honestly. Disabling looks too aggressive, but it's an
issue I wouldn't like a user to be facing without at least being
informed of the possibility... so, perhaps a (set of) warning(s)?

I think we have 2 possible solutions:

1. Try to handle this all in the hypervisor via CPUID mangling.

2. Add PV-topology support to the guest and indicate this capability via an
   elfnote; only enable PV-numa if this note is present.

I'd prefer the second solution. If you are okay with this, I'd try to do
some patches for the pvops kernel.
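For illustration, such a capability note could look roughly like the sketch below. Everything here is an assumption: the note type number is made up (real Xen note types are allocated in xen/include/public/elfnote.h), and the pvops kernel would presumably use its existing ELFNOTE() machinery rather than an open-coded struct:

```c
#include <stdint.h>

/* HYPOTHETICAL sketch: an ELF note a pvops kernel could carry to
 * advertise "I understand PV topology". The type value below is
 * invented; a real one would need to be allocated in
 * xen/include/public/elfnote.h. */
#define XEN_ELFNOTE_PV_TOPOLOGY 0x100   /* hypothetical, not allocated */

struct xen_elfnote {
    uint32_t namesz;    /* strlen("Xen") + 1 */
    uint32_t descsz;    /* size of desc, in bytes */
    uint32_t type;      /* note type */
    char     name[4];   /* "Xen", NUL-padded to 4-byte alignment */
    uint32_t desc;      /* capability flags */
};

const struct xen_elfnote pv_topology_note
__attribute__((section(".note.Xen"), aligned(4), used)) = {
    .namesz = 4,
    .descsz = 4,
    .type   = XEN_ELFNOTE_PV_TOPOLOGY,
    .name   = "Xen",
    .desc   = 1,        /* bit 0: guest can consume PV topology info */
};
```

Since the toolstack already parses the kernel's Xen ELF notes when building a PV domain, gating vNUMA on the presence of such a note would presumably fit the existing flow.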

Why do you think that kernel patches are preferable to CPUID management? This would all be in the tools, I'd think. (Well, one problem that I can think of is that AMD sometimes pokes at MSRs and/or the Northbridge's PCI registers to figure out the nodeID; we may need to address that in the hypervisor.)

And those patches won't help HVM guests, will they? And how would they be useful to user processes?


What if I configure a guest to follow the HW topology, i.e. pin VCPUs to
the appropriate cores/threads? With the elfnote I am stuck with the topology disabled.

Add an option to do exactly that: follow HW topology (pin vcpus,
configure vnuma)?
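Something like the following xl config fragment would express that (syntax per xl.cfg(5); node sizes, distances and the nodes: pinning shorthand are made-up example values for a hypothetical 2-node host):

```
# Hypothetical example: 4 vcpus, 2 vnodes mirroring 2 physical nodes,
# with each vcpu pinned to the cpus of its vnode's physical node.
vcpus = 4
memory = 4096

vnuma = [ [ "pnode=0", "size=2048", "vcpus=0-1", "vdistances=10,20" ],
          [ "pnode=1", "size=2048", "vcpus=2-3", "vdistances=20,10" ] ]

cpus = [ "nodes:0", "nodes:0", "nodes:1", "nodes:1" ]
```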

Add a force flag to the vnuma configuration to ignore the elfnote?

Besides, this is not necessarily a NUMA-only issue, it's a scheduling
one (inside the guest) as well.

Sure. That's what Jan said regarding SUSE's Xen kernel. No topology info
(or a trivial one) might be better than the wrong one...

This patch for pvops should be written in any case. I'll do this, but it
would be nice to know whether PV-numa should be considered or not.

Xen-devel mailing list


