[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [BUG] x2apic broken with current AMD hardware


  • To: Elliott Mitchell <ehem+xen@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 20 Mar 2023 09:28:20 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3krFKUF7EmSxiLRn955kU41PB/CuJ919vaCLLIEw+N4=; b=XRH9DiotpZ4MmkSfYWW0PU9LLgC7ddVCCAXH5kAOq8UtPZkQtXzncXNuxqv1I2ps4E1TPIygyDPhJnWCkiQo64iVgM/rib4LOalooZYK7XhsVYLzawKeerwa0/2UG9aat5P8jSi/PPnpaRLMLlcBC/6XHdy9Ea1Es28iS4rlR7e7GOU+1rEQuy6NEfobD1H5mAKBjc3KlO8j2Fc6zAm6tH1da02S4GoP0KeI/3vZTc+hexCAs+6sN2YU6TvaJ5NXLSLM90mOSrQ8rzF6z5ZXAWopfpzlDVXNA+aUS/bOxovtKClPcl5L6AQeIhW8x/956pQwfUcPOWe62G0TGORiUA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Ell8f6XvNKCeC5cSWElrWkZePKbPRnql9fZbKwrnKRM6pctwT/UWvjWZPbV4eCAihUVir/KLFnQ4xyhGp8K6veP8b0pzJ1sjUXLBF8x/cSxss79c0+2639FRq63X9r5iOJ/2C8T4tAc1Pe2e+UarfLBV8RlQ/2ueC9JsCX0agWlVhSzKKjkcFJRfHTKuB5uRq0PdQlwppHgTnLFo3TZfbcL5meICuM2C5b9kx4aqd/w4A55YSkP1h7AVl+Ib/BNtcaNKICb8PTXgcPr1nuzkNkfrTFM1URwnWwP+ij8gM1DsB688aq2ftGbK1ub/uAZK8kCAms6zi+GORAJI3zymtA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 20 Mar 2023 08:28:40 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20.03.2023 09:14, Jan Beulich wrote:
> On 17.03.2023 18:26, Elliott Mitchell wrote:
>> On Fri, Mar 17, 2023 at 09:22:09AM +0100, Jan Beulich wrote:
>>> On 16.03.2023 23:03, Elliott Mitchell wrote:
>>>> On Mon, Mar 13, 2023 at 08:01:02AM +0100, Jan Beulich wrote:
>>>>> On 11.03.2023 01:09, Elliott Mitchell wrote:
>>>>>> On Thu, Mar 09, 2023 at 10:03:23AM +0100, Jan Beulich wrote:
>>>>>>>
>>>>>>> In any event you will want to collect a serial log at maximum verbosity.
>>>>>>> It would also be of interest to know whether turning off the IOMMU 
>>>>>>> avoids
>>>>>>> the issue as well (on the assumption that your system has less than 255
>>>>>>> CPUs).
>>>>>>
>>>>>> I think I might have figured out the situation in a different fashion.
>>>>>>
>>>>>> I was taking a look at the BIOS manual for this motherboard and noticed
>>>>>> a mention of a "Local APIC Mode" setting.  Four values are listed
>>>>>> "Compatibility", "xAPIC", "x2APIC", and "Auto".
>>>>>>
>>>>>> That is the sort of setting I likely left at "Auto" and that may well
>>>>>> result in x2 functionality being disabled.  Perhaps the x2APIC
>>>>>> functionality on AMD is detecting whether the hardware is present, and
>>>>>> failing to test whether it has been enabled?  (could be useful to output
>>>>>> a message suggesting enabling the hardware feature)
>>>>>
>>>>> Can we please move to a little more technical terms here? What is 
>>>>> "present"
>>>>> and "enabled" in your view? I don't suppose you mean the CPUID bit (which
>>>>> we check) and the x2APIC-mode-enable one (which we drive as needed). It's
>>>>> also left unclear what the four modes of BIOS operation evaluate to. Even
>>>>> if we knew that, overriding e.g. "Compatibility" (which likely means some
>>>>> form of "disabled" / "hidden") isn't normally an appropriate thing to do.
>>>>> In "Auto" mode Xen likely should work - the only way I could interpret the
>>>>> the other modes are "xAPIC" meaning no x2APIC ACPI tables entries (and
>>>>> presumably the CPUID bit also masked), "x2APIC" meaning x2APIC mode pre-
>>>>> enabled by firmware, and "Auto" leaving it to the OS to select. Yet that's
>>>>> speculation on my part ...
>>>>
>>>> I provided the information I had discovered.  There is a setting for this
>>>> motherboard (likely present on some similar motherboards) which /may/
>>>> effect the issue.  I doubt I've tried "compatibility", but none of the
>>>> values I've tried have gotten the system to boot without "x2apic=false"
>>>> on Xen's command-line.
>>>>
>>>> When setting to "x2APIC" just after "(XEN) AMD-Vi: IOMMU Extended 
>>>> Features:"
>>>> I see the line "(XEN) - x2APIC".  Later is the line
>>>> "(XEN) x2APIC mode is already enabled by BIOS."  I'll guess "Auto"
>>>> leaves the x2APIC turned off since neither line is present.
>>>
>>> When "(XEN) - x2APIC" is absent the IOMMU can't be switched into x2APIC
>>> mode. Are you sure that's the case when using "Auto"?
>>
>> grep -eAPIC\ driver -e-\ x2APIC:
>>
>> "Auto":
>> (XEN) Using APIC driver default
>> (XEN) Overriding APIC driver with bigsmp
>> (XEN) Switched to APIC driver x2apic_cluster
>>
>> "x2APIC":
>> (XEN) Using APIC driver x2apic_cluster
>> (XEN) - x2APIC
>>
>> Yes, I'm sure.
> 
> Okay, this then means we're running in a mode we don't mean to run
> in: When the IOMMU claims to not support x2APIC mode (which is odd in
> the first place when at the same time the CPU reports x2APIC mode as
> supported), amd_iommu_prepare() is intended to switch interrupt
> remapping mode to "restricted" (which in turn would force x2APIC mode
> to "physical", not "clustered"). I notice though that there are a
> number of error paths in the function which bypass this setting. Could
> you add a couple of printk()s to understand which path is taken (each
> time; the function can be called more than once)?

I think I've spotted at least one issue. Could you give the patch below
a try please? (Patch is fine for master and 4.17 but would need context
adjustment for 4.16.)

Jan

AMD/IOMMU: without XT, x2APIC needs to be forced into physical mode

An earlier change with the same title (commit 1ba66a870eba) altered only
the path where x2apic_phys was already set to false (perhaps from the
command line). The same of course needs applying when the variable
wasn't modified yet from its initial value.

Reported-by: Elliott Mitchell <ehem+xen@xxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

--- unstable.orig/xen/arch/x86/genapic/x2apic.c
+++ unstable/xen/arch/x86/genapic/x2apic.c
@@ -236,11 +236,11 @@ const struct genapic *__init apic_x2apic
     if ( x2apic_phys < 0 )
     {
         /*
-         * Force physical mode if there's no interrupt remapping support: The
-         * ID in clustered mode requires a 32 bit destination field due to
+         * Force physical mode if there's no (full) interrupt remapping 
support:
+         * The ID in clustered mode requires a 32 bit destination field due to
          * the usage of the high 16 bits to hold the cluster ID.
          */
-        x2apic_phys = !iommu_intremap ||
+        x2apic_phys = iommu_intremap != iommu_intremap_full ||
                       (acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL) ||
                       (IS_ENABLED(CONFIG_X2APIC_PHYSICAL) &&
                        !(acpi_gbl_FADT.flags & ACPI_FADT_APIC_CLUSTER));





 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.