
Re: [PATCH 1/1] x86: centralize default APIC id definition


  • To: Juergen Gross <jgross@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Alex Olson <this.is.a0lson@xxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 1 Oct 2021 18:38:12 +0100
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Alex Olson <alex.olson@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 01 Oct 2021 17:38:48 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/10/2021 16:08, Juergen Gross wrote:
> On 01.10.21 16:29, Andrew Cooper wrote:
>> On 01/10/2021 15:19, Jan Beulich wrote:
>>> On 24.09.2021 21:39, Alex Olson wrote:
>>>> Inspired by an earlier attempt by Chao Gao <chao.gao@xxxxxxxxx>,
>>>> this revision aims to put the hypervisor in control of x86 APIC identifier
>>>> definition instead of hard-coding a formula in multiple places
>>>> (libxl, hvmloader, hypervisor).
>>>>
>>>> This is intended as a first step toward exposing/altering CPU topology
>>>> seen by guests.
>>>>
>>>> Changes:
>>>>
>>>> - Add field to vlapic for holding default ID (on reset)
>>>>
>>>> - add HVMOP_get_vcpu_topology_id hypercall so libxl (for PVH domains)
>>>>    can access APIC ids needed for ACPI table definition prior to
>>>>    domain start.
>>>>
>>>> - For HVM guests, hvmloader now also uses the same hypercall.
>>>>
>>>> - Make CPUID code use vlapic ID instead of hard-coded formula
>>>>    for runtime reporting to guests
>>> I'm afraid a primary question from back at the time remains: How is
>>> migration of a guest from an old hypervisor to one with this change
>>> in place going to work?
>>
>> I'm afraid it's not.
>>
>> Fixing this is incredibly complicated.  I have a vague plan, but it
>> needs building on the still-pending libxl cpuid work of Roger's.
>>
>> Both the toolstack and Xen need to learn about how to describe topology
>> correctly (and I'm afraid this patch isn't correct even for a number of
>> the simple cases), and know about "every VM booted up until this point
>> in time" being wrong.
>
> What about:
>
> - adding APIC-Id to the migration stream
> - adding an optional translation layer for guest APIC-Id to the
>   hypervisor
> - adding the functionality to set a specific APIC-Id for a vcpu
>   (will use the translation layer if not the same as preferred
>   by the hypervisor)

The vCPU APIC IDs are already in the migration stream.  They're just too
late in the stream for any easy fixup.

A second problem we have is that (x)APIC IDs are writeable under Xen,
but writeability of the register is a model-specific trait to begin
with.  Furthermore, you get potentially differing behaviour depending on
whether APICV is enabled or not.  I plan to fix this by simply outlawing
it - OSes don't renumber the APICs in the first place (just like they
don't move the MMIO window), and all they'd do is break things.

The main topology problem is that we have no interlink between the
CPUID-described data, and the default APIC IDs chosen.  There are 5
different algorithms to choose from (vendor and CPU dependent) and we
implement 0 of them.

The xl config file needs more than just cpuid= data to express the
topology correctly, because for non-power-of-two systems, there need to
be gaps in the APIC_ID space, and this needs communicating to Xen too. 
(For old AMD, we also need a slide, but we can probably leave that as an
exercise to anyone who cares, which seems to be no one so far).
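
To illustrate why non-power-of-two topologies leave gaps in the APIC_ID
space, here is a minimal sketch.  It assumes the common layout where each
topology level gets a bit field wide enough for its count, rounded up to a
power of two.  The helper names field_width() and topo_apic_id() are
hypothetical, not existing Xen or libxl functions, and this is only one of
the vendor-dependent algorithms.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to number `count` items: ceil(log2(count)). */
static unsigned int field_width(unsigned int count)
{
    unsigned int bits = 0;

    assert(count > 0);
    while ((1u << bits) < count)
        bits++;
    return bits;
}

/*
 * Hypothetical helper: compose an APIC ID from (package, core, thread)
 * using one power-of-two sized bit field per topology level.
 */
static uint32_t topo_apic_id(unsigned int pkg, unsigned int core,
                             unsigned int thread,
                             unsigned int cores_per_pkg,
                             unsigned int threads_per_core)
{
    unsigned int thread_bits = field_width(threads_per_core);
    unsigned int core_bits = field_width(cores_per_pkg);

    assert(core < cores_per_pkg && thread < threads_per_core);
    return (pkg << (core_bits + thread_bits)) | (core << thread_bits) | thread;
}
```

With 3 cores per package and 2 threads per core, package 0 uses IDs 0-5
and package 1 starts at 8, so IDs 6 and 7 are never assigned.  Those holes
are what cpuid= data alone cannot convey to Xen.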

Either way, when the toolstack can reason about topologies correctly, we
can extend the xl json in the stream.  The absence of the marker serves
as "This VM didn't boot with sane topology", for which we can use the
fallback logic (see libxl__cpuid_legacy() and the soon to exist
companion in libxc) to re-synthesize the old pattern when data is
missing in the stream.
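
As a rough sketch of that marker-based fallback: the struct layout and
function names below are hypothetical (they are not the libxl or libxc
interfaces), and the legacy pattern is assumed to be the long-standing
APIC ID = vcpu_id * 2 formula.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the topology data carried in the stream. */
struct stream_topology {
    bool has_marker;           /* set only by a topology-aware toolstack */
    const uint32_t *apic_ids;  /* per-vCPU APIC IDs, valid if has_marker */
};

/* Assumed legacy pattern: APIC ID = vcpu_id * 2. */
static uint32_t legacy_apic_id(unsigned int vcpu_id)
{
    return vcpu_id * 2;
}

/*
 * Pick the APIC ID to restore for a vCPU.  Without the marker, the VM
 * booted with the old pattern, so re-synthesize it.  With the marker,
 * trust the stream; the caller must ensure vcpu_id indexes apic_ids.
 */
static uint32_t restore_apic_id(const struct stream_topology *t,
                                unsigned int vcpu_id)
{
    if (!t->has_marker)
        return legacy_apic_id(vcpu_id);
    return t->apic_ids[vcpu_id];
}
```

The point of keying on the marker rather than on the hypervisor version
is that the decision travels with the VM, so old and new VMs remain
distinguishable after any number of migrations.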

Any change to the topology algorithms before the toolstack is capable of
doing everything else leaves us with two[1] different kinds of VMs that
we can't tell apart, and cannot cope with in a compatible way.

~Andrew

[1] Actually 3.  XenServer still has a revert of ca2eee92df44 in the
patchqueue because that broke VMs which migrated across the point.  As
it's from 2008, pre-and-post VMs aren't something we need to care about,
because anyone still running Xen 3.3 has far bigger problems than this
to worry about.




 

