Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

On 07/27/2015 04:51 PM, Boris Ostrovsky wrote:
On 07/27/2015 10:43 AM, Juergen Gross wrote:
On 07/27/2015 04:34 PM, Boris Ostrovsky wrote:
On 07/27/2015 10:09 AM, Dario Faggioli wrote:
On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
On 07/24/2015 05:58 PM, Dario Faggioli wrote:
So, just to check if my understanding is correct: you'd like to add an
abstraction layer, in Linux, like in generic (or, perhaps,
code, to hide the direct interaction with CPUID.
Such layer, on baremetal, would just read CPUID while, on PV-ops,
check with Xen/match vNUMA/whatever... Is this what you are saying?
Sort of, yes.

I just wouldn't add it, as it is already existing (more or less). It
can deal right now with AMD and Intel, we would "just" have to add

So, having gone through the rest of the thread (so far), and having
given a fair amount of thought to this, I really think that something
like this would be a good thing to have in Linux.

Of course, it's not that my opinion on where it should live in Linux counts
that much! :-D   Nevertheless, I wanted to make it clear that, while
skeptical at the beginning, I now think this is (part of) the way to go,
as I said and explained in my reply to George.

And I continue to believe that a kernel solution does not address the
userland problem, which is no less important than getting the kernel to
make proper scheduling decisions (and I suspect that when this patch goes
for review, that's what the scheduling people are going to say).

Remember, the original problem that started this thread was that the kernel
complained that the topology didn't make sense and turned off all
topology-related decisions. Which means that the kernel already has a
solution for weird topology. Some enumerations don't trigger this
warning, but we can come up with one that does. Or we can indeed have a
patch in the kernel that will, possibly silently, fail topology_sane() when
virtualized and not pinned.
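
To make the kind of check under discussion concrete, here is a minimal Python sketch. It is not the kernel's code: it assumes a simplified model in which a topology is rejected whenever two CPUs that claim to be siblings (same core or shared cache) are reported on different NUMA nodes. The names `topology_sane`, `all_groups_sane`, `cpu_to_node` and `sibling_groups` are hypothetical and chosen only for this illustration.

```python
def topology_sane(cpu_to_node, cpu1, cpu2):
    """Simplified model: two sibling CPUs must sit on the same NUMA node."""
    return cpu_to_node[cpu1] == cpu_to_node[cpu2]


def all_groups_sane(cpu_to_node, sibling_groups):
    """Return True only if every sibling group stays within one NUMA node."""
    for group in sibling_groups:
        first = group[0]
        for other in group[1:]:
            if not topology_sane(cpu_to_node, first, other):
                return False
    return True


# Hypothetical guest: 4 vCPUs, 2 virtual NUMA nodes.
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}

# Siblings enumerated consistently with the vNUMA layout.
consistent = [[0, 1], [2, 3]]
# Siblings enumerated across vNUMA nodes (the "weird" case).
inconsistent = [[0, 2], [1, 3]]

print(all_groups_sane(cpu_to_node, consistent))
print(all_groups_sane(cpu_to_node, inconsistent))
```

In this model, an enumeration that pairs vCPUs across virtual nodes fails the check, which corresponds to the case where topology-related decisions would be switched off.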

How would you come up with a topology that the kernel complains about
and that user-mode scheduling can still use for sane decisions?

We need to understand first why Dario's box is apparently the only one
resulting in a warning and probably then emulate that enumeration.

This will lead to other problems in user land e.g. with hwloc.

And again, if that is not possible then just make topology_sane() fail.

And again: first you claim that kernel mode isn't everything, and here
you fail to respect possible userland requirements.

(This is what I assume kernel does when topology_sane() fails. And if it
doesn't, that's a bug IMO)

The licensing problem that Juergen described can be solved by pinning
vcpus and exposing HT bit. Besides, creating a guest with 24 VCPUs and

Hmm, yes. This way you sacrifice most of the virtualization advantages.

hoping that 16-core licensing will work I think is pushing it a bit when
you know that VCPUs will jump around cores (i.e. "on average" you are
running on more than 16 cores -- multi-threaded or not -- which arguably
is what licensing is trying to prevent)

On a machine with only 16 cores, running on more than 16 cores? I find
this hard to believe. The point was: if the license is happy on
bare metal it should be so when running on the same hardware as a guest.

Ok, that's not how I should have described it. I meant that IMO asking
for 24 VCPUs is somewhat akin to oversubscribing since you kind of know
that you don't have 24 PCPUs, you are just trying to fool the kernel
into thinking that threads are cores.

/proc/cpuinfo on bare metal will list 32 cpus. xl info in dom0 will list
32 cpus. You have 32 entities where you can do scheduling. So what's the
problem having a domU with 24 vcpus? There are still 8 pcpus free for
e.g. dom0 then.
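
The difference between the 32 schedulable entities and the 16 cores can be seen by counting both from /proc/cpuinfo-style text. The following is a minimal Python sketch, assuming the usual x86 Linux field names (`processor`, `physical id`, `core id`); the function name `count_cpus` and the sample text are made up for this illustration.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, distinct_cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
            physical_id = None  # start of a new CPU entry
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            # A core is identified by its (socket, core) pair.
            cores.add((physical_id, value))
    return logical, len(cores)


# Made-up sample: one socket, 2 cores, 2 threads per core.
SAMPLE = "\n".join(
    f"processor\t: {n}\nphysical id\t: 0\ncore id\t\t: {n // 2}\n"
    for n in range(4)
)

print(count_cpus(SAMPLE))
```

On a Linux system the same function can be fed the contents of /proc/cpuinfo; whether a PV guest reports meaningful `core id` values there depends on what the hypervisor exposes, which is the subject of this thread.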

