
Re: [RFC XEN PATCH 6/6] tools/libs/light: pci: translate irq to gsi



On Fri, 17 Mar 2023, Roger Pau Monné wrote:
> On Fri, Mar 17, 2023 at 11:15:37AM -0700, Stefano Stabellini wrote:
> > On Fri, 17 Mar 2023, Roger Pau Monné wrote:
> > > On Fri, Mar 17, 2023 at 09:39:52AM +0100, Jan Beulich wrote:
> > > > On 17.03.2023 00:19, Stefano Stabellini wrote:
> > > > > On Thu, 16 Mar 2023, Jan Beulich wrote:
> > > > >> So yes, it then all boils down to that Linux-
> > > > >> internal question.
> > > > > 
> > > > > > Excellent question but we'll have to wait for Ray as he is the one with
> > > > > > access to the hardware. But I have this data I can share in the
> > > > > > meantime:
> > > > > 
> > > > > [    1.260378] IRQ to pin mappings:
> > > > > [    1.260387] IRQ1 -> 0:1
> > > > > [    1.260395] IRQ2 -> 0:2
> > > > > [    1.260403] IRQ3 -> 0:3
> > > > > [    1.260410] IRQ4 -> 0:4
> > > > > [    1.260418] IRQ5 -> 0:5
> > > > > [    1.260425] IRQ6 -> 0:6
> > > > > [    1.260432] IRQ7 -> 0:7
> > > > > [    1.260440] IRQ8 -> 0:8
> > > > > [    1.260447] IRQ9 -> 0:9
> > > > > [    1.260455] IRQ10 -> 0:10
> > > > > [    1.260462] IRQ11 -> 0:11
> > > > > [    1.260470] IRQ12 -> 0:12
> > > > > [    1.260478] IRQ13 -> 0:13
> > > > > [    1.260485] IRQ14 -> 0:14
> > > > > [    1.260493] IRQ15 -> 0:15
> > > > > [    1.260505] IRQ106 -> 1:8
> > > > > [    1.260513] IRQ112 -> 1:4
> > > > > [    1.260521] IRQ116 -> 1:13
> > > > > [    1.260529] IRQ117 -> 1:14
> > > > > [    1.260537] IRQ118 -> 1:15
> > > > > [    1.260544] .................................... done.
> > > > 
> > > > And what does Linux think are IRQs 16 ... 105? Have you compared with
> > > > Linux running baremetal on the same hardware?
> > > 
> > > > So I have some emails from Ray from the time he was looking into this,
> > > > and on Linux dom0 PVH dmesg there is:
> > > 
> > > > [    0.065063] IOAPIC[0]: apic_id 33, version 17, address 0xfec00000, GSI 0-23
> > > > [    0.065096] IOAPIC[1]: apic_id 34, version 17, address 0xfec01000, GSI 24-55
> > > 
> > > So it seems the vIO-APIC data provided by Xen to dom0 is at least
> > > consistent.
> > >  
> > > > > > And I think Ray traced the point in Linux where Linux gives us an
> > > > > > IRQ == 112 (which is the one causing issues):
> > > > > 
> > > > > __acpi_register_gsi->
> > > > >         acpi_register_gsi_ioapic->
> > > > >                 mp_map_gsi_to_irq->
> > > > >                         mp_map_pin_to_irq->
> > > > >                                 __irq_resolve_mapping()
> > > > > 
> > > > >         if (likely(data)) {
> > > > >                 desc = irq_data_to_desc(data);
> > > > >                 if (irq)
> > > > >                         *irq = data->irq;
> > > > >                 /* this IRQ is 112, IO-APIC-34 domain */
> > > > >         }
> > > 
> > > 
> > > Could this all be a result of patch 4/5 in the Linux series ("[RFC
> > > PATCH 4/5] x86/xen: acpi registers gsi for xen pvh"), where a different
> > > > __acpi_register_gsi hook is installed for PVH in order to set up GSIs
> > > using PHYSDEV ops instead of doing it natively from the IO-APIC?
> > > 
> > > FWIW, the introduced function in that patch
> > > (acpi_register_gsi_xen_pvh()) seems to unconditionally call
> > > acpi_register_gsi_ioapic() without checking if the GSI is already
> > > registered, which might lead to multiple IRQs being allocated for the
> > > same underlying GSI?
> > 
> > I understand this point and I think it needs investigating.
> > 
> > 
> > > As I commented there, I think that approach is wrong.  If the GSI has
> > > not been mapped in Xen (because dom0 hasn't unmasked the respective
> > > IO-APIC pin) we should add some logic in the toolstack to map it
> > > before attempting to bind.
> > 
> > But this statement confuses me. The toolstack doesn't get involved in
> > IRQ setup for PCI devices for HVM guests?
> 
> It does for GSI interrupts AFAICT, see pci_add_dm_done() and the call
> to xc_physdev_map_pirq().  I'm not sure whether that's a remnant that
> could be removed (maybe for qemu-trad only?) or it's also required by
> QEMU upstream, I would have to investigate more.

You are right, I can now see the call to xc_physdev_map_pirq you were
referring to. I am not certain, but it seems like a mistake in the
toolstack to me: in theory, pci_add_dm_done should only be needed for PV
guests, not for HVM guests.


> It's my understanding it's in pci_add_dm_done() where Ray was getting
> the mismatched IRQ vs GSI number.

I think the mismatch was actually caused by the xc_physdev_map_pirq call
from QEMU. That makes sense because QEMU's call should in any case happen
before the same call made by pci_add_dm_done (pci_add_dm_done is called
after sending the PCI passthrough QMP command to QEMU). So the first to
hit the IRQ != GSI problem would be QEMU.
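
To make the IRQ vs GSI difference concrete, here is a minimal sketch in C
of the pin-to-GSI arithmetic, using the two IO-APIC ranges from the dom0
PVH dmesg quoted above. The struct and function names are hypothetical
(this is not Linux or Xen code), and it assumes the "IRQ to pin mappings"
entries read as apic:pin.

```c
/* Hypothetical descriptor for one IO-APIC's GSI window (illustrative only). */
struct ioapic_range {
    unsigned int gsi_base; /* first GSI served by this IO-APIC */
    unsigned int gsi_end;  /* last GSI served by this IO-APIC, inclusive */
};

/* Ranges copied from the dmesg quoted earlier in this thread. */
static const struct ioapic_range ioapics[] = {
    {  0, 23 }, /* IOAPIC[0]: GSI 0-23  */
    { 24, 55 }, /* IOAPIC[1]: GSI 24-55 */
};

/*
 * Translate an (apic index, pin) pair to a GSI.
 * Returns -1 if the apic index or the pin is out of range.
 */
static int pin_to_gsi(unsigned int apic, unsigned int pin)
{
    if (apic >= sizeof(ioapics) / sizeof(ioapics[0]))
        return -1;
    if (pin > ioapics[apic].gsi_end - ioapics[apic].gsi_base)
        return -1;
    return (int)(ioapics[apic].gsi_base + pin);
}
```

Under those assumptions, the "IRQ112 -> 1:4" entry works out to
pin_to_gsi(1, 4) == 28, so the Linux IRQ number (112) and the GSI (28)
differ for that pin, while for the first IO-APIC they coincide (e.g.
"IRQ9 -> 0:9" gives GSI 9).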

 

