[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/1] x86: centralize default APIC id definition



I hadn't seriously considered the scenario of VM migration from older
Xen versions,  but If I understand Andrew's last statement correctly,
it sounds like my patch will break things in new ways that cannot be
fixed at this time. If that is the case, I'll abandon my efforts with
this patch. 

Thank you for the reviews and detailed feedback,

-Alex

On Fri, 2021-10-01 at 18:38 +0100, Andrew Cooper wrote:
> On 01/10/2021 16:08, Juergen Gross wrote:
> > On 01.10.21 16:29, Andrew Cooper wrote:
> > > On 01/10/2021 15:19, Jan Beulich wrote:
> > > > On 24.09.2021 21:39, Alex Olson wrote:
> > > > > Inspired by an earlier attempt by Chao Gao <
> > > > > chao.gao@xxxxxxxxx>,
> > > > > this revision aims to put the hypervisor in control of x86
> > > > > APIC
> > > > > identifier
> > > > > definition instead of hard-coding a formula in multiple
> > > > > places
> > > > > (libxl, hvmloader, hypervisor).
> > > > > 
> > > > > This is intended as a first step toward exposing/altering CPU
> > > > > topology
> > > > > seen by guests.
> > > > > 
> > > > > Changes:
> > > > > 
> > > > > - Add field to vlapic for holding default ID (on reset)
> > > > > 
> > > > > - add HVMOP_get_vcpu_topology_id hypercall so libxl (for PVH
> > > > > domains)
> > > > >    can access APIC ids needed for ACPI table definition prior
> > > > > to
> > > > > domain start.
> > > > > 
> > > > > - For HVM guests, hvmloader now also uses the same hypercall.
> > > > > 
> > > > > - Make CPUID code use vlapic ID instead of hard-coded formula
> > > > >    for runtime reporting to guests
> > > > I'm afraid a primary question from back at the time remains:
> > > > How is
> > > > migration of a guest from an old hypervisor to one with this
> > > > change
> > > > in place going to work?
> > > 
> > > I'm afraid its not.
> > > 
> > > Fixing this is incredibly complicated.  I have a vague plan, but
> > > it
> > > needs building on the still-pending libxl cpuid work of Rogers.
> > > 
> > > Both the toolstack and Xen need to learn about how to describe
> > > topology
> > > correctly (and I'm afraid this patch isn't correct even for a
> > > number of
> > > the simple cases), and know about "every VM booted up until this
> > > point
> > > in time" being wrong.
> > 
> > What about:
> > 
> > - adding APIC-Id to the migration stream
> > - adding an optional translation layer for guest APIC-Id to the
> >   hypervisor
> > - adding the functionality to set a specific APIC-Id for a vcpu
> >   (will use the translation layer if not the same as preferred
> >   by the hypervisor)
> 
> The vCPU APIC IDs are already in the migration stream.  They're just
> too
> late in the stream for any easy fixup.
> 
> A second problem we have is that (x)APIC IDs are writeable under Xen,
> but writeability of the register is a model specific trait to being
> with.  Furthermore, you get potentially differing behaviour depending
> on
> whether APICV is enabled or not.  I plan to fix this by simply
> outlawing
> it - OSes don't renumber the APICs in the first place (just like they
> don't move the MMIO window), and all they'll do is break things.
> 
> The main topology problem is that we have no interlink between the
> CPUID-described data, and the default APIC IDs chosen.  There are 5
> different algorithms to choose from (vendor and CPU dependent) and we
> implement 0 of them.
> 
> The xl config file needs more than just cpuid= data to express the
> topology correctly, because for non-power-of-two systems, there need
> to
> be gaps in the APIC_ID space, and this needs communicating to Xen
> too. 
> (For old AMD, we also need a slide, but we can probably leave that as
> an
> exercise to anyone who cares, which seems to be noone so far).
> 
> Either way, when the toolstack can reason about topologies correctly,
> we
> can extend the xl json in the stream.  The absense of the marker
> serves
> as "This VM didn't boot with sane topology", which we can use the
> fallback logic (see libxl__cpuid_legacy() and the soon to exist
> companion in libxc) to re-synthesize the old pattern for when data is
> missing in the stream.
> 
> Any change to the topology algorithms before the toolstack is capable
> of
> doing everything else leaves us with two[1] different kinds of VMs
> that
> we can't tell apart, and cannot cope with in a compatible way.
> 
> ~Andrew
> 
> [1] Actually 3.  XenServer still has a revert of ca2eee92df44 in the
> patchqueue because that broke VMs which migrated across the point. 
> As
> it's from 2008, pre-and-post VMs aren't something we need to care
> about,
> because anyone still running Xen 3.3 has far bigger problems than
> this
> to worry about.
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.