Xen project Mailing List

Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

To: Jan Beulich <jbeulich@xxxxxxxx>, Sébastien Chaumat <euidzero@xxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: David Woodhouse <dwmw2@xxxxxxxxxxxxx>

Date: Thu, 10 Oct 2024 15:45:14 +0100

Cc: Mario Limonciello <mario.limonciello@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Thu, 10 Oct 2024 14:45:24 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, 2024-03-07 at 09:39 +0100, Jan Beulich wrote: > On 06.03.2024 18:28, Sébastien Chaumat wrote: > > Reasoning backward (using a kernel without the pinctrl_amd driver to > > > ensure xen only is at stake) : > > > checking the diff in IOAPIC between bare metal and xen (IRQ7 is on > > > pin07 on APIC ) > > > > > > using kernel argument : apic=debug > > > > > > bare metal : > > > [ 0.715330] fedora kernel: ... APIC VERSION: 81050010 > > > ... > > > [ 0.715433] fedora kernel: pin07, disabled, edge , high, V(00), > > > IRR(0), S(0), physical, D(0000), M(0) > > > > > > xen : > > > [ 2.249582] fedora kernel: ... APIC VERSION: 00000014 > > > ... > > > [ 2.249730] fedora kernel: pin07, disabled, level, low , V(60), > > > IRR(0), S(0), physical, D(0000), M(0) > > > > > > So the APIC table is not the same. > > > > > > As strange as it looks the (IOAPIC 0) pin07 is correctly described by the > > > APIC in xen but yet differently than in baremetal. > > > But the APIC message comes long after the > > > [ 1.833145] fedora kernel: xen: registering gsi 7 triggering 0 polarity > > > 1 > > > > > > so I wonder if the APIC pin07 info had any influence. > > > > > > Finally found the fix : adding ioapic_ack=new to xen boot parameters. > > Not only the trackpad is now working but also the ACPI Embedded Controller > > which is completely disabled. > > Hmm, interesting. From someone else's laptop many years ago I had actually > an indication in the opposite direction: That didn't work because of our > defaulting to new (no directed EOI in sight yet back at that time). I > wonder if overriding the ack method isn't actually just papering over the > underlying actual issue here, whatever that is. IOW with the edge vs level > mismatch addressed I'd hope the override could then be dropped again. With interrupt remapping enabled, I'm surprised any of it works. Especially the broadcast EOI. Remember, the I/O APIC is just a device for turning line interrupts into MSIs. You couldn't tell from the Xen code, but there's fairly much no reason for the IOMMU interrupt remapping code, and the I/O APIC code, to know *anything* about each other. There is an upstream *target* of the MSIs generated from the I/O APIC. That's either an Intel IOMMU, and AMD IOMMU, or a standard APIC without remapping. You ask that upstream target to compose a standard MSI message (address within 0xFEExxxxx, data). Then you hand that to the I/O APIC and all it needs to do is swizzle those address and data bits into the right place in the RTE. For more information see https://lore.kernel.org/all/20201024213535.443185-22-dwmw2@xxxxxxxxxxxxx/ where I decoupled these in Linux, and a little light bedtime reading at http://david.woodhou.se/more-than-you-ever-wanted-to-know-about-x86-msis.txt The only caveat to this simplicity is the whole EOI thing. Historically, the I/O APIC would be told of an EOI for a given *vector*, and it would scan the RTEs for all its pins and clear the remote_irr bit for any level-triggered pin which matched that vector. But now, those low 8 bits of the RTE, which end up in the low 8 bits of the MSI data, aren't necessarily the 'vector' at all. For the Intel IOMMU remapping, those bits aren't used at all (so actually we can just put the vector in there, and keep the I/O APIC happy). But with an AMD IOMMU, the low bits of the MSI data are actually used as the index into the IOMMU's Interrupt Remapping Table. So *that* is the number which needs to be fed to the I/O APIC's explicit EOI register.... and good luck getting the CPU's local APIC to generate the right value for the broadcast! So... with the directed EOI, when __eoi_IO_APIC_irq() finds the *actual* vector with 'vector = desc->arch.vector', then calls __io_apic_eoi() with that value... ... and when __io_apic_eoi() then blindly writes that 'vector' to the I/O APIC EOI register, it's writing the wrong thing. At least with AMD interrupt remapping enabled. So I believe this 'fixes' the issue, when using directed EOI: --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -298,8 +298,15 @@ static void __io_apic_eoi(unsigned int apic, unsigned int vector, unsigned int p /* Prefer the use of the EOI register if available */ if ( ioapic_has_eoi_reg(apic) ) { - /* If vector is unknown, read it from the IO-APIC */ - if ( vector == IRQ_VECTOR_UNASSIGNED ) + /* If vector is unknown, read it from the IO-APIC. Also, + * with interrupt remapping, the field the IO-APIC thinks + * is the "vector" might be something completely different. + * With an AMD IOMMU it's the low bits of the IRTE index + * into the IOMMU's table, for example. So using the actual + * delivery vector which is stored in the IRTE would be wrong; + * we need to use the actual bits the IO-APIC sees in the RTE. + */ + if ( iommu_intremap || vector == IRQ_VECTOR_UNASSIGNED ) vector = __ioapic_read_entry(apic, pin, true).vector; *(IO_APIC_BASE(apic)+16) = vector; That's actually reading back from the RTE each time, which isn't ideal. I think what we should actually do is ensure there's a 1:1 mapping between the I/O APIC pin numbers and the IRTE indices (since the AMD IOMMU has a per-device IRT anyway), and then you can just write the pin# *instead* of the vector# to the EOI register. I don't really see how this would ever work with the "new" (that is, broadcast EOI) method. I'm confused about that part. I wonder if there's some AMD magic, like broadcast EOI ending *all* level interrupts?

Attachment: smime.p7s
Description: S/MIME cryptographic signature

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.