[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD



On Thu, 2024-03-07 at 09:39 +0100, Jan Beulich wrote:
> On 06.03.2024 18:28, Sébastien Chaumat wrote:
> > Reasoning backward  (using a  kernel without the pinctrl_amd driver to
> > > ensure xen only is at stake) :
> > >  checking the diff in IOAPIC  between bare metal and xen  (IRQ7 is on
> > > pin07 on APIC )
> > > 
> > > using kernel argument : apic=debug
> > > 
> > > bare metal :
> > > [    0.715330] fedora kernel: ... APIC VERSION: 81050010
> > > ...
> > > [    0.715433] fedora kernel:  pin07, disabled, edge , high, V(00),
> > > IRR(0), S(0), physical, D(0000), M(0)
> > > 
> > > xen :
> > > [    2.249582] fedora kernel: ... APIC VERSION: 00000014
> > > ...
> > > [    2.249730] fedora kernel:  pin07, disabled, level, low , V(60),
> > > IRR(0), S(0), physical, D(0000), M(0)
> > > 
> > > So the APIC table is not the same.
> > > 
> > > As strange as it looks the  (IOAPIC 0) pin07 is correctly described by the
> > > APIC in xen but yet differently than in baremetal.
> > > But the APIC message comes long after the
> > > [    1.833145] fedora kernel: xen: registering gsi 7 triggering 0 polarity
> > > 1
> > > 
> > > so I wonder if the APIC pin07 info had any influence.
> > > 
> > > Finally found the fix : adding ioapic_ack=new to xen boot parameters.
> > Not only the trackpad is now working but also the ACPI Embedded Controller
> > which is completely disabled.
> 
> Hmm, interesting. From someone else's laptop many years ago I had actually
> an indication in the opposite direction: That didn't work because of our
> defaulting to new (no directed EOI in sight yet back at that time). I
> wonder if overriding the ack method isn't actually just papering over the
> underlying actual issue here, whatever that is. IOW with the edge vs level
> mismatch addressed I'd hope the override could then be dropped again.

With interrupt remapping enabled, I'm surprised any of it works.
Especially the broadcast EOI.

Remember, the I/O APIC is just a device for turning line interrupts
into MSIs. You couldn't tell from the Xen code, but there's fairly much
no reason for the IOMMU interrupt remapping code, and the I/O APIC
code, to know *anything* about each other.

There is an upstream *target* of the MSIs generated from the I/O APIC.
That's either an Intel IOMMU, and AMD IOMMU, or a standard APIC without
remapping. You ask that upstream target to compose a standard MSI
message (address within 0xFEExxxxx, data). 

Then you hand that to the I/O APIC and all it needs to do is swizzle
those address and data bits into the right place in the RTE.

For more information see 
https://lore.kernel.org/all/20201024213535.443185-22-dwmw2@xxxxxxxxxxxxx/
where I decoupled these in Linux, and a little light bedtime reading at
http://david.woodhou.se/more-than-you-ever-wanted-to-know-about-x86-msis.txt

The only caveat to this simplicity is the whole EOI thing.
Historically, the I/O APIC would be told of an EOI for a given
*vector*, and it would scan the RTEs for all its pins and clear the
remote_irr bit for any level-triggered pin which matched that vector.

But now, those low 8 bits of the RTE, which end up in the low 8 bits of
the MSI data, aren't necessarily the 'vector' at all. For the Intel
IOMMU remapping, those bits aren't used at all (so actually we can just
put the vector in there, and keep the I/O APIC happy).

But with an AMD IOMMU, the low bits of the MSI data are actually used
as the index into the IOMMU's Interrupt Remapping Table. So *that* is
the number which needs to be fed to the I/O APIC's explicit EOI
register.... and good luck getting the CPU's local APIC to generate the
right value for the broadcast!

So... with the directed EOI, when __eoi_IO_APIC_irq() finds the
*actual* vector with 'vector = desc->arch.vector', then calls
__io_apic_eoi() with that value...

... and when __io_apic_eoi() then blindly writes that 'vector' to the
I/O APIC EOI register, it's writing the wrong thing. At least with AMD
interrupt remapping enabled.

So I believe this 'fixes' the issue, when using directed EOI: 

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -298,8 +298,15 @@ static void __io_apic_eoi(unsigned int apic, unsigned int 
vector, unsigned int p
     /* Prefer the use of the EOI register if available */
     if ( ioapic_has_eoi_reg(apic) )
     {
-        /* If vector is unknown, read it from the IO-APIC */
-        if ( vector == IRQ_VECTOR_UNASSIGNED )
+        /* If vector is unknown, read it from the IO-APIC. Also,
+         * with interrupt remapping, the field the IO-APIC thinks
+         * is the "vector" might be something completely different.
+         * With an AMD IOMMU it's the low bits of the IRTE index
+         * into the IOMMU's table, for example. So using the actual
+         * delivery vector which is stored in the IRTE would be wrong;
+         * we need to use the actual bits the IO-APIC sees in the RTE.
+         */
+        if ( iommu_intremap || vector == IRQ_VECTOR_UNASSIGNED )
             vector = __ioapic_read_entry(apic, pin, true).vector;
 
         *(IO_APIC_BASE(apic)+16) = vector;


That's actually reading back from the RTE each time, which isn't ideal.
I think what we should actually do is ensure there's a 1:1 mapping
between the I/O APIC pin numbers and the IRTE indices (since the AMD
IOMMU has a per-device IRT anyway), and then you can just write the
pin# *instead* of the vector# to the EOI register.

I don't really see how this would ever work with the "new" (that is,
broadcast EOI) method. I'm confused about that part. I wonder if
there's some AMD magic, like broadcast EOI ending *all* level
interrupts?



Attachment: smime.p7s
Description: S/MIME cryptographic signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.