Re: [Xen-devel] [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file

On 03/20/2014 05:25 AM, Ian Campbell wrote:
On Wed, 2014-03-19 at 10:41 -0400, Boris Ostrovsky wrote:
On 03/19/2014 05:27 AM, Ian Campbell wrote:
On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
Currently only "real" cpuid leaves can be overwritten by users via
'cpuid' option in the configuration file. This patch provides ability to
do the same for hypervisor leaves (but for now only 0x40000000 is allowed).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
   tools/libxc/xc_cpuid_x86.c   |   71 
   xen/arch/x86/domain.c        |   19 +++++++++--
   xen/arch/x86/traps.c         |    3 ++
   xen/include/asm-x86/domain.h |    7 +++++
   4 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index bbbf9b8..5501d5b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -33,6 +33,8 @@
   #define DEF_MAX_INTELEXT  0x80000008u
   #define DEF_MAX_AMDEXT    0x8000001cu
+#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
Not idx == 0x40000000?

Also as I think Jan said before if viridian support is enabled then the
Xen leaves may be elsewhere (at 0x100 increments above that address).
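
To make the difference between the two checks concrete, here is a minimal sketch in C. The first helper mirrors the mask test from the patch. The second is a hypothetical stricter test that accepts only a possible base leaf, assuming (per the comment above) that bases sit at 0x100 increments from 0x40000000. The names in_hypervisor_range, is_hypervisor_base_leaf, HV_LEAF_BASE and HV_LEAF_STRIDE are illustrative and do not come from the patch or from libxc.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_LEAF_BASE   0x40000000u  /* first hypervisor leaf */
#define HV_LEAF_STRIDE 0x100u       /* assumed spacing between candidate bases */

/* Same test as IS_HYPERVISOR_LEAF: true for 0x40000000..0x4000ffff. */
static bool in_hypervisor_range(uint32_t idx)
{
    return (idx & 0xffff0000u) == HV_LEAF_BASE;
}

/*
 * Hypothetical stricter test: true only for a candidate base leaf,
 * i.e. 0x40000000 plus a multiple of 0x100.
 */
static bool is_hypervisor_base_leaf(uint32_t idx)
{
    return in_hypervisor_range(idx) && (idx & (HV_LEAF_STRIDE - 1)) == 0;
}
```

With these, 0x40000001 is inside the range but is not a base leaf, while 0x40000100 passes both tests. A plain idx == 0x40000000 comparison would reject 0x40000100.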

   static int hypervisor_is_64bit(xc_interface *xch)
       xen_capabilities_info_t xen_caps = "";
@@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
   static void cpuid(const unsigned int *input, unsigned int *regs)
       unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
+    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
   #ifdef __i386__
       /* Use the stack to avoid reg constraint failures with some gcc flags */
       asm (
           "push %%ebx; push %%edx\n\t"
+        "testb $0xff,%5\n\t"
+        "jz .Lcpuid%=\n\t"
+        ".Lcpuid%=:\n\t"
           "mov %%ebx,4(%4)\n\t"
           "mov %%edx,12(%4)\n\t"
           "pop %%edx; pop %%ebx\n\t"
           : "=a" (regs[0]), "=c" (regs[2])
-        : "0" (input[0]), "1" (count), "S" (regs)
+        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)
I think this would be clearer refactored into make_real_cpuid() and
make_pv_cpuid() functions.
Would a comment explaining why we do it this way be sufficient or do you
really want to split this into two routines?
I think splitting would be clearer, just by virtue of being able to give
the functions comprehensible names.

Actually, now that I hear both of you arguing for only being able to change the max number of hypervisor leaves, I think I'll drop pretty much all changes to libxc and pass the user's values directly to the hypervisor. The check there will again be specific to 0x40000x00.eax[7:0]; everything else will be Xen's default, and the user's request for it will be ignored.

I was aiming for more generic support for changes to hypervisor leaves, but since there is no interest in this (as there is no real need right now) we can come back to it when the need arises.
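
A minimal sketch of what such a targeted check might look like, in C. It is not taken from the patch. It assumes that only bits [7:0] of EAX in the base leaf are taken from the user, that a request may lower the limit but not raise it above Xen's default, and that a zero low byte means "no request". The name apply_user_max_leaf and both parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine Xen's default EAX for the hypervisor base
 * leaf with a user-supplied value. Only the low byte (the leaf limit) is
 * honoured; every other bit keeps Xen's default.
 */
static uint32_t apply_user_max_leaf(uint32_t xen_default_eax, uint32_t user_eax)
{
    uint32_t def_low = xen_default_eax & 0xffu;
    uint32_t req_low = user_eax & 0xffu;

    /* Assumption: zero means "not set", and the limit can only be lowered. */
    if (req_low == 0 || req_low > def_low)
        req_low = def_low;

    return (xen_default_eax & ~0xffu) | req_low;
}
```

For example, with a default of 0x40000004 a request of 0x40000002 yields 0x40000002, while a request whose low byte is 9 is clamped back to 0x40000004.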

  (And I assume you meant
make_hv_cpuid, not make_pv_cpuid.)
I meant pv -- XEN_EMULATE_PREFIX+cpuid instr is a "pv cpuid", isn't it?
It's not clear what "hypervisor cpuid" would be -- is it the cpuid which
the hypervisor sees (i.e. real) or is it some fabrication, in which case
how does it depend on the context (i.e. with respect to which guest is
it)?

I was thinking this would be used for hypervisor leaves only but you are right, this is a pv cpuid.

It also seems strange to use emulated for a subset of leafs, although I
understand why.

How does this play out in e.g. a PVH toolstack domain where even the
"real" cpuid might be faked?
It shouldn't matter what the guest is, the hypervisor leaves are
Except when you've changed them for a guest using the functionality you
are adding here, surely?

Yes, what I meant was that changing hypervisor leaves will work for all types of guests.

I think.

Perhaps we should have a hypercall to retrieve the complete set of real
h/w, levelled h/w, pv, emulated etc values for a given leaf?
That's what I was thinking about (except for leveled values) when I
implemented sysctl in the early version of this series but then Andrew
pointed out that for what I need prefixed cpuid was sufficient, so I
went that route.
Hrm, it may be sufficient, but is it a good interface?

Not as it was written --- it was for a single leaf and not for the whole set in one call. But something along those lines.

@@ -726,6 +740,57 @@ int xc_cpuid_check(
       return rc;
+static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
+    unsigned int start, unsigned int end, char *config)
+static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
+    unsigned int *polregs, char **config_transformed)
These complicated functions most certainly need some sort of explanation
about what they are trying to do.
They will be gone since the functionality will be implemented in the
OK good, although if they reappear in hypervisor context I suppose my
comments will still stand.

They won't look like they did. As I said above, it will be a simple, targeted test.


