[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/x2apic: introduce a mixed physical/cluster mode
Content-D isposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <81f6bbd5-0487-461a-af1a-dbb6ead47cab@xxxxxxxxxx> On 2023-11-18 11:11, Andrew Cooper wrote: > On 18/11/2023 3:04 am, Elliott Mitchell wrote: > > On Fri, Nov 17, 2023 at 11:12:37AM +0100, Neowutran wrote: > >> On 2023-11-07 11:11, Elliott Mitchell wrote: > >>> On Mon, Oct 30, 2023 at 04:27:22PM +01 > >>>> On Mon, Oct 30, 2023 at 07:50:27AM -0700, Elliott Mitchell wrote: > >>>>> On Tue, Oct 24, 2023 at 03:51:50PM +0200, Roger Pau Monne wrote: > >>>>>> diff --git a/xen/arch/x86/genapic/x2apic.c > >>>>>> b/xen/arch/x86/genapic/x2apic.c > >>>>>> index 707deef98c27..15632cc7332e 100644 > >>>>>> --- a/xen/arch/x86/genapic/x2apic.c > >>>>>> +++ b/xen/arch/x86/genapic/x2apic.c > >>>>>> @@ -220,38 +239,56 @@ static struct notifier_block x2apic_cpu_nfb = { > >>>>>> static int8_t __initdata x2apic_phys = -1; > >>>>>> boolean_param("x2apic_phys", x2apic_phys); > >>>>>> > >>>>>> +enum { > >>>>>> + unset, physical, cluster, mixed > >>>>>> +} static __initdata x2 apic_mode = unset; > >>>>>> + > >>>>>> +static int __init parse_x2apic_mode(const char *s) > >>>>>> +{ > >>>>>> + if ( !cmdline_strcmp(s, "physical") ) > >>>>>> + x2apic_mode = physical; > >>>>>> + else if ( !cmdline_strcmp(s, "cluster") ) > >>>>>> + x2apic_mode = cluster; > >>>>>> + else if ( !cmdline_strcmp(s, "mixed") ) > >>>>>> + > >>>>>> + else > >>>>>> + return EINVAL; > >>>>>> + > >>>>>> + return 0; > >>>>>> +} > >>>>>> +custom_param("x2apic-mode", parse_x2apic_mode); > >>>>>> + > >>>>>> const struct genapic *__init apic_x2apic_probe(void) > >>>>>> { > >>>>>> - if ( x2apic_phys < 0 ) > >>>>>> + /* x2apic-mode option has preference over x2apic_phys. */ > >>>>>> + if ( x2apic_phys >= 0 && x2apic_mode == unset ) > >>>>>> + x2apic_mode = x2apic_phys ? physical : cluster; > >>>>>> + > >>>>>> + if ( x2apic_mode == unset ) > >>>>>> { > >>>>>> - /* > >>>>>> - * Force physical mode if there's no (full) interrupt > >>>>>> remapping support: > >>>>>> - * The ID in clustered mode requires a 32 bit destination > >>>>>> field due to > >>>>>> - * the usage of the high 16 bits to hold the cluster ID. > >>>>>> - */ > >>>>>> - x2apic_phys = iommu_intremap != iommu_intremap_full || > >>>>>> - (acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL) > >>>>>> || > >>>>>> - > >>>>>> - } > >>>>>> - else if ( !x2apic_phys ) > >>>>>> - switch ( iommu_intremap ) > >>>>>> + if ( acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL ) > >>>>>> { > >>>>> Could this explain the issues with recent AMD processors/motherboards? > >>>>> > >>>>> Mainly the firmware had been setting this flag, but Xen was previously > >>>>> ignoring it? > >>>> No, not unless you pass {no-}x2apic_phys={false,0} on the Xen command > >>>> line to force logical (clustered) destination mode. > >>>> > >>>>> As such Xen had been attempting to use cluster mode on an > >>>>> x2APIC where that mode was broken for physical interrupts? > >>>> No , not realy, x2apic_phys was already forced to true if > >>>> acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL is set on the FADT (I > >>>> just delete that line in this same chunk and move it here). > >>> Okay, that was from a quick look at the patch. Given the symptoms and > >>> workaround with recent AMD motherboards, this looked > >>> > >>> In that case it might be a bug in what AMD is providing to motherboard > >>> manufacturers. Mainly this bit MUST be set, but AMD's implementation > >>> leaves it unset. > >>> > >>> Could also be if the setup is done correctly the bit can be cleared, but > >>> multiple motherboard manufacturers are breaking this. Perhaps the steps > >>> are fragile and AMD needed to provide better guidance. > >>> > >>> > >>> Neowutran, are you still setup to and interested in doing > >>> experimentation/testing with Xen on your AMD computer? Would you be up > >>> for trying the patch here: > >>> > >>> https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger.pau@xxxxxxxxxx/raw > >>> > >>> I have a suspicion this *might* fix the x2APIC issue everyone has been > >>> seeing. > >>> > >>> How plausible would it be to release this as a bugfix/workaround on 4.17? > >>> I'm expecting a "no", but figured I should ask given how widespread the > >>> issue is. > >> I just applied the patch on my setup ( > >> https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger.pau@xxxxxxxxxx/raw > >> ) > >> It seems to fix the x2APIC issue I was having. > >> > >> I only did some quick tests: > >> > >> I tried all the differents values in my bios for the X2APIC settings. > >> Now the system successfully boot in all the cases, without needing > >> manual override of the x2apic_phys/x2apic_mode parameter in boot > >> commandline . > > In light of this issue effecting a large number of people with recent > > hardware, I formally request the patch > > "x86/x2apic: introduce a mixed physical/cluster mode" be considered for > > backport release on the 4.17 and 4.18 branches. > > > > (I'm unsure whether it is realistic for a 4.17 update, but I figure I > > should ask) > > This is an unreasonable ask. > > I believe you when you say there is (or at least was) an x2apic bug (or > bugs), but not once did you provide the logging requested, nor engage > usefully with us in debugging. > > And despite this, we (Roger, Jan and myself) found, fixed and backported > 3 x2apic bugs. > > Now you come along guessing alone at x2apic in a patch name that it > fixes your problem, on a patch which is not a bugfix - it's a > performance optimisation. ------ > Neowutran, thankyou for looking into the patch, but I'm afraid that > doesn't confirm that this patch fixed an issue either. If it really did > make a difference, then you'll see a difference in behaviour using each > of the 3 new x2apic-mode= options. > > Please could you take your single up-to-date build of Xen, put the BIOS > settings back to whatever was causing you problems originally, and > describe what happens when booting each of > x2apic-mode={physical,cluster,mixed}? Hi, I did some more tests and research, indeed this patch improved/solved my specific case. Starting point: I am using Xen version 4.17.2 (exactly this source https://github.com/QubesOS/qubes-vmm-xen). In the bios (a Asus motherboard), I configured the "local apic" parameter to "X2APIC". For Xen, I did not set the parameter "x2apic-mode" nor the parameter "x2apic_phys". Case 1: I tryied to boot just like that, result: system is unusuably slow Case 2: Then, I applied a backport of the patch https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger.pau@xxxxxxxxxx/raw to the original Xen version of QubesOS and I recompiled. (https://github.com/neowutran/qubes-vmm-xen/blob/x2apic3/X2APIC.patch) Result: it work, the system is usable. Case 3: Then, I applied the patch https://github.com/xen-project/xen/commit/26a449ce32cef33f2cb50602be19fcc0c4223ba9 to the original Xen version of QubesOS and I recompiled. (https://github.com/neowutran/qubes-vmm-xen/blob/x2apic4/X2APIC.patch) Result: system is unusuably slow. In "Case 2", the value returned by the function "apic_x2apic_probe" is "&apic_x2apic_mixed". In "Case 3", the value returned by the function "apic_x2apic_probe" is "&apic_x2apic_cluster". ------------------- If you want / need, details for the function "apic_x2apic_probe": Known "input" value: "CONFIG_X2APIC_PHYSICAL" is not defined "iommu_intremap == iommu_intremap_off" = false "acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL" -> 0 "acpi_gbl_FADT.flags" = 247205 (in decimal) "CONFIG_X2APIC_PHYSICAL" is not defined "CONFIG_X2APIC_MIXED" is defined, because it is the default choice "x2apic_mode" = 0 "x2apic_phys" = -1 Trace log (I did some call "printk" to trace what was going on) Case 2: (XEN) NEOWUTRAN: X2APIC_MODE: 0 (XEN) NEOWUTRAN: X2APIC_PHYS: -1 (XEN) NEOWUTRAN: acpi_gbl_FADT.flags: 247205 (XEN) NEOWUTRAN IOMMU_INTREMAP: different (XEN) Neowutran: PASSE 2 (XEN) Neowutran: PASSE 4 (XEN) NEOWUTRAN: X2APIC_MODE: 3 (XEN) Neowutran: PASSE 7 (XEN) NEOWUTRAN: X2APIC_MODE: 3 (XEN) NEOWUTRAN: X2APIC_PHYS: -1 (XEN) NEOWUTRAN: acpi_gbl_FADT.flags: 247205 (XEN) NEOWUTRAN IOMMU_INTREMAP: different Case 3: (XEN) NEOWUTRAN2: X2APIC_PHYS: -1 (XEN) NEOWUTRAN2: acpi_gbl_FADT.flags: 247205 (XEN) NEOWUTRAN2 IOMMU_INTREMAP: different (XEN) Neowutran2: Passe 1 (XEN) NEOWUTRAN2: X2APIC_PHYS: 0 (XEN) Neowutran2: Passe 6 (XEN) Neowutran2: Passe 7 (XEN) NEOWUTRAN2: X2APIC_PHYS: 0 (XEN) NEOWUTRAN2: acpi_gbl_FADT.flags: 247205 (XEN) NEOWUTRAN2 IOMMU_INTREMAP: different (XEN) Neowutran2: Passe 2 (XEN) Neowutran2: Passe 4 (XEN) Neowutran2: Passe 7 If you require the full logs, I could publish the full logs somewhere. ---------------------- ( However I do not understand if the root issue is a buggy motherboard, a bug in xen, or if the parameter "X2APIC_PHYSICAL" should have been set by the QubesOS project, or something else) > Thankyou, > > ~Andrew Thanks you, Neowutran
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |