Re: [Xen-devel] HVM CPU enumeration, mapping to VCPU ID (Was: Re: [Xen-users] FreeBSD PVHVM call for testing)



On Mon, Jun 03, 2013 at 10:44:43AM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, May 31, 2013 at 11:53:22AM -0700, Matt Wilson wrote:
> > On Fri, May 31, 2013 at 09:21:50AM +0100, Ian Campbell wrote:
> > > On Thu, 2013-05-30 at 10:16 -0700, Matt Wilson wrote:
> > > > 
> > > > On bare metal x86 Linux, the kernel enumerates CPUs based on an order
> > > > defined by the BIOS. Typically this means that all the cores are
> > > > enumerated first, followed by logical processors (HT/SMT). For Linux,
> > > > maxcpus=N/2 should disable HT on systems that enumerate processors in
> > > > the recommended order. Some history:
> > > >   https://bugzilla.kernel.org/show_bug.cgi?id=2317
> > > 
> > > How the guest chooses to enumerate the CPUs is not terribly relevant so
> > > long as the Xen specific code for that OS knows how to invert that
> > > mapping to get at the underlying ABI which determines Xen's VCPUID for a
> > > CPU.
> > 
> > Indeed.
> > 
> > > I think I was wrong to focus on the guest enumeration scheme before,
> > > what actually matters is where in our ABI we expose the VCPUID, which
> > > isn't at all clear to me.
> > 
> > Agreed.
> > 
> > > > The virtual BIOS provides both ACPI tables and a legacy MP-table that
> > > > gives the LAPIC id mapping. The guest could infer the Xen vCPU ID from
> > > > a processor's position in these tables.
> > > 
> > > Do we consider the ordering given in any of those tables to be an HVM
> > > guest ABI? What about the lapic_id == 2*vcpuid -- is that multiplication
> > > factor part of the ABI (i.e. is the guest expected to pass lapic_id/2 to
> > > vcpuop)?
> > 
> > I strongly prefer the order in the BIOS tables, *not* the
> > lapic_id = 2*vcpuid formula. Once I've done some libxl work, I'll be
> > proposing a patch that makes the LAPIC / x2APIC IDs configurable,
> > and that will break this assumption.
> > 
> > > > Or we could add a VCPUOP that an enlightened guest could use to get
> > > > the information more directly.
> > > 
> > > I'm hoping that there is some existing interface which I simply don't
> > > know about, but yes this could be the answer if such a thing doesn't
> > > exist.
> > 
> > I don't know of one that provides the information explicitly. It might
> > be easiest to provide this as a hypervisor CPUID leaf so it can be
> > used in early boot.
> > 
> > > > One question: why does a hypercall take a parameter that only has one
> > > > valid value? That value can be determined by looking at the current
> > > > running vCPU.
> > > 
> > > The generic prototype is:
> > >         vcpu_op(int cmd, int vcpuid, void *extra_args)
> > > Some cmds can act on any vcpuid and others can only act on the current
> > > vcpu. In an ideal world we would have had VCPUID_SELF or something but
> > > it's a bit late for that.
> > 
> > Yea, that makes sense.
> > 
> > > > The *2 is just for assigning the LAPIC ID, and I'm pretty sure that
> > > > Linux is assigning processor IDs sequentially at ACPI parse time.
> > > 
> > > That probably doesn't matter, what matters is the Xen specific parts of
> > > the kernel's ability to reverse that assignment to get at the underlying
> > > APIC ID, assuming that is actually an ABI from which we can infer the
> > > VCPU ID...
> > 
> > Indeed. This seems to be loosely defined so far, and easy to get wrong
> > as happened with this FreeBSD work.
> > 
> > Konrad, Keir - any thoughts here?
> 
> I am a bit confused by 'I strongly prefer the order in the BIOS tables'.
> The way I understand it - Linux sets up the vCPUs based on the LAPIC
> entries which are created by the hvmloader. There are no hypercalls or any
> lapic_id = 2 * vcpuid formula in the Linux kernel. I presume what you meant
> by the lapic_id = 2 * vcpuid is more of this:
> 
>     for ( i = 0; i < nr_processor_objects; i++ )
>     {
>         memset(lapic, 0, sizeof(*lapic));
>         lapic->type    = ACPI_PROCESSOR_LOCAL_APIC;
>         lapic->length  = sizeof(*lapic);
>         /* Processor ID must match processor-object IDs in the DSDT. */
>         lapic->acpi_processor_id = i;
>         lapic->apic_id = LAPIC_ID(i);
> 
> Which sets this up.

Right, all of the LAPIC information is provided to the guest OS via
the MADT. I believe what I'm observing is that Linux and Windows use
the order of entries to enumerate processors in the system.

What we typically see on bare metal Intel systems is something like
this (example system has 16 cores with HT):

All of the "cores"...
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x04] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x06] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x08] enabled)
...

Followed by all of the "threads"...
[    0.000000] ACPI: LAPIC (acpi_id[0x10] lapic_id[0x01] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x11] lapic_id[0x03] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x12] lapic_id[0x05] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x13] lapic_id[0x07] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x14] lapic_id[0x09] enabled)

Since Xen hard codes the LAPIC ID (and x2APIC ID) to 0, 2, 4, 6, 8,
etc. (vCPUID * 2), everything looks like a core.
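
To make the two layouts concrete, here is a minimal C sketch. The
helper names are hypothetical and exist only for illustration; the
bare-metal function assumes the "cores first, then threads" order shown
in the log above, with a one-bit SMT field in the LAPIC ID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: LAPIC ID of the MADT entry at position idx on a
 * bare-metal system that lists all cores first, then all HT siblings.
 * ncores is the number of physical cores (16 in the example above). */
static uint32_t baremetal_lapic_id(unsigned int idx, unsigned int ncores)
{
    assert(ncores > 0 && idx < 2 * ncores);
    if (idx < ncores)
        return 2 * idx;             /* first thread of a core: even ID */
    return 2 * (idx - ncores) + 1;  /* sibling thread: odd ID */
}

/* What the virtual firmware hands out today, per this thread. */
static uint32_t xen_lapic_id(unsigned int vcpu_id)
{
    return 2 * vcpu_id;
}

/* Assuming a one-bit SMT field, bit 0 marks the sibling thread. */
static int looks_like_ht_sibling(uint32_t lapic_id)
{
    return (int)(lapic_id & 1);
}
```

Under these assumptions looks_like_ht_sibling() is never true for an ID
produced by xen_lapic_id(), which is why every vCPU appears to be a
separate core.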

> So .. assuming this was thought out, why are we starting on vCPUs
> that don't match to this? That seems like a bug? (Note, this is 
> with maxvcpus=32, vcpus=1 and the starting of a VCPU1 actually
> ended up starting at VCPU4?!).

I'm lost. What?

> I think all of this can be sorted out if the hvmloader sets the
> LAPIC CPU == VCPU ID properly.

No, that's not the right answer. Or, at least, not completely. Right
now Xen provides the same ID for both the LAPIC and x2APIC. In order
for CPU topology discovery to work, the x2APIC ID needs to follow a
particular structure. See the Intel whitepaper on processor topology
enumeration:
  http://software.intel.com/sites/default/files/m/d/4/1/d/8/Kuo_CpuTopology_rc1.rh1.final.pdf
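
As a rough illustration of the structure that paper describes, the
x2APIC ID is split into SMT, core, and package fields by bit position.
The sketch below is not taken from Xen or Linux; the struct and function
names are made up, and the two shift widths are passed in as parameters
(on real hardware they would come from CPUID leaf 0xB, which reports
cumulative shift counts).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct cpu_topo {
    uint32_t smt_id;   /* thread within its core */
    uint32_t core_id;  /* core within its package */
    uint32_t pkg_id;   /* package */
};

/* smt_shift:  number of low bits holding the SMT ID.
 * core_shift: number of low bits holding SMT ID and core ID together. */
static struct cpu_topo decode_x2apic_id(uint32_t id,
                                        unsigned int smt_shift,
                                        unsigned int core_shift)
{
    struct cpu_topo t;

    assert(smt_shift <= core_shift && core_shift < 32);
    t.smt_id  = id & ((1u << smt_shift) - 1);
    t.core_id = (id & ((1u << core_shift) - 1)) >> smt_shift;
    t.pkg_id  = id >> core_shift;
    return t;
}
```

With smt_shift = 1, an ID of 2 * vCPUID always decodes to smt_id 0, so a
guest doing topology discovery this way would never see a sibling
thread.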

> So perhaps a better question is - why is it not set up properly
> nowadays? If the formula is baked in for the PVHVM guests, somewhere
> the formula is not being evaluated properly?

The "LAPIC ID = 2 * vCPUID" formula is not baked into any OS that I
know of, and it shouldn't be. It should all be discovered via
firmware/BIOS tables. The enumeration order in the tables should,
under best practices, match the logical processor ID assignment in the
OS.
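
A minimal sketch of what "discover it from the tables" could look like
on the guest side, assuming the position of an entry in the MADT is
treated as the vCPU ID. That convention is what is being proposed in
this thread, not an established ABI, and the struct below is a
simplified, hypothetical view of a Local APIC entry rather than the
real ACPI layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical view of one MADT Local APIC entry. */
struct madt_lapic {
    uint8_t acpi_processor_id;
    uint8_t apic_id;
};

/* Return the position of apic_id in the table, or -1 if it is absent.
 * Under the assumed convention, that position is the vCPU ID. */
static int vcpu_id_from_madt(const struct madt_lapic *tbl, size_t n,
                             uint8_t apic_id)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tbl[i].apic_id == apic_id)
            return (int)i;
    return -1;
}
```

The point of looking up by position is that it keeps working if the
APIC IDs stop being 2 * vCPUID, whereas dividing the APIC ID by two
would silently pick the wrong vCPU.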

> The new hypercall to figure this out could be used, but that wouldn't
> explain why we are failing to start on the correct VCPU?

I didn't follow the jump here. Can you provide an example?

--msw

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel