Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

On 07/22/2015 04:44 PM, Boris Ostrovsky wrote:
On 07/22/2015 10:09 AM, Juergen Gross wrote:
On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
On 07/22/2015 09:50 AM, Juergen Gross wrote:
On 07/22/2015 03:36 PM, Dario Faggioli wrote:
On Tue, 2015-07-21 at 16:00 -0400, Boris Ostrovsky wrote:
On 07/20/2015 10:43 AM, Boris Ostrovsky wrote:
On 07/20/2015 10:09 AM, Dario Faggioli wrote:

I'll need to see how LLC IDs are calculated, probably also from some
CPUID bits.

No, can't do this: LLC is calculated from CPUID leaf 4 (on Intel), which
uses an index in the ECX register, and xl syntax doesn't allow you to
specify CPUIDs for such leaves.
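
For illustration, here is a rough sketch of how an LLC ID can be derived
from leaf 4. It assumes the documented Intel layout of CPUID.4:EAX (cache
type in bits 4:0, cache level in bits 7:5, "logical processors sharing
this cache, minus 1" in bits 25:14) and masks the APIC ID accordingly,
which is roughly what a guest kernel does. The function names and the
make_eax() helper are made up for this example; real code would read one
EAX value per ECX subleaf from the CPUID instruction itself.

```python
def count_order(n: int) -> int:
    """Smallest k such that 2**k >= n, for n >= 1."""
    return (n - 1).bit_length()


def make_eax(cache_type: int, level: int, threads_sharing: int) -> int:
    """Hypothetical helper: build a CPUID.4:EAX value for testing."""
    return (cache_type & 0x1F) | ((level & 0x7) << 5) | (((threads_sharing - 1) & 0xFFF) << 14)


def llc_id(apic_id: int, subleaf_eax_values) -> int:
    """Derive an LLC ID from the APIC ID and the leaf 4 subleaves.

    subleaf_eax_values holds EAX for ECX = 0, 1, 2, ... in order.
    Enumeration stops at the first subleaf whose cache type is 0.
    """
    best_level = 0
    sharing = 1
    for eax in subleaf_eax_values:
        if eax & 0x1F == 0:          # cache type 0: no more caches
            break
        level = (eax >> 5) & 0x7
        if level > best_level:       # keep the highest cache level seen
            best_level = level
            sharing = ((eax >> 14) & 0xFFF) + 1
    # CPUs whose APIC IDs differ only in the low bits share the LLC.
    mask = (1 << count_order(sharing)) - 1
    return apic_id & ~mask


# Example: L1D and L2 shared by 2 threads, L3 shared by 16 threads.
subleaves = [
    make_eax(1, 1, 2),   # L1 data cache
    make_eax(3, 2, 2),   # L2 unified cache
    make_eax(3, 3, 16),  # L3 unified cache (the LLC here)
    make_eax(0, 0, 1),   # terminator
]
same_llc = llc_id(0x13, subleaves) == llc_id(0x1F, subleaves)
```

The point for this thread is that the result depends on per-subleaf
values, so a guest sees a wrong LLC ID unless each subleaf can be
adjusted individually.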

Right. Which leaves us with the question of what we should do and/or
recommend that users do.

If there were a workaround that we could put in place, and document
somewhere, however tricky it was, I'd say to go for it, and call it
acceptable for now.

But, if there isn't, should we disable PV vNUMA, or warn the user that
he may see issues? Can we identify, in Xen or in the toolstack, whether a
host topology will be problematic, and disable/warn in those cases?

I'm not sure, honestly. Disabling looks too aggressive, but it's an
issue I wouldn't like a user to be facing, without at least being
informed of the possibility... so, perhaps a (set of) warning(s)?

I think we have 2 possible solutions:

1. Try to handle this all in the hypervisor via CPUID mangling.

2. Add PV-topology support to the guest and indicate this capability
   via an elfnote; only enable PV-NUMA if this note is present.
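
A minimal sketch of the gating logic in option 2, purely to make the
idea concrete. Everything here is an assumption: the feature name
"pv_topology", the GuestKernel type and should_enable_pv_numa() are
invented for the example and do not correspond to an existing elfnote
or toolstack function.

```python
from dataclasses import dataclass, field

# Hypothetical capability name; no such elfnote is defined in this thread.
PV_TOPOLOGY_FEATURE = "pv_topology"


@dataclass
class GuestKernel:
    """Stand-in for the capabilities parsed from a guest kernel's elfnotes."""
    features: set = field(default_factory=set)


def should_enable_pv_numa(kernel: GuestKernel, vnuma_requested: bool) -> bool:
    """Enable PV-NUMA only if it was requested and the guest advertises
    that it understands a PV topology."""
    return vnuma_requested and PV_TOPOLOGY_FEATURE in kernel.features


old_kernel = GuestKernel()
new_kernel = GuestKernel(features={PV_TOPOLOGY_FEATURE})
```

With this shape, a guest kernel that lacks the note simply never gets a
vNUMA layout it would misinterpret.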

I'd prefer the second solution. If you are okay with this, I'd try
to do some patches for the pvops kernel.

Why do you think that kernel patches are preferable to CPUID management?
This would be all in tools, I'd think. (Well, one problem that I can
think of is that AMD sometimes pokes at MSRs and/or Northbridge's PCI
registers to figure out nodeID; that is something we may have to address
in the hypervisor.)

Doing it via CPUID is more HW specific. Trying to fake a topology for
the guest from outside might lead to weird decisions in the guest e.g.
regarding licenses based on socket counts.

If you are doing it in the guest itself you are able to address the
different problems (scheduling, licensing) in different ways.

And those patches won't help HVM guests, will they? How would they be
used by user processes?

HVM can use pv interfaces as well. It's called pv-NUMA :-)

Hmm, I didn't think of user processes. Are you aware of cases where they
are to be considered? The only case where user processes are involved I
could think of is licensing again. Depending on the licensing model,
playing with CPUID is either good or bad. I can even imagine the CPUID
configuration capabilities in xl are in use today for exactly this
purpose. Using them for pv-NUMA as well will make this feature unusable
for those users.


