Re: [Xen-devel] Re: [PATCH] xen: fix interrupt routing
On 14.06.2011, at 15:27, Stefano Stabellini wrote:

> On Tue, 14 Jun 2011, Alexander Graf wrote:
>>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>>> {
>>>>>>>     PCII440FXState *d = opaque;
>>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>>>     d = pci_create_simple(b, 0, device_name);
>>>>>>>     *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>>
>>>>>>> -    piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> -                      pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>>> +    if (xen_enabled()) {
>>>>>>> +        piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> +                pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>>> +        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>>> +                piix3, XEN_PIIX_NUM_PIRQS);
>>>>>>
>>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the
>>>>>> reason behind this change?
>>>>>
>>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>>> IO-APIC.
>>>>> The four pins of each PCI device on the bus not only are routed to the
>>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>>> they are connected to the IO-APIC directly.
>>>>> These additional routes can only be discovered through ACPI, so you need
>>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>>
>>>>>     /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>>>     printf("Name(PRTA, Package() {\n");
>>>>>     for ( dev = 1; dev < 32; dev++ )
>>>>>         for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>>>             printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>>>                    dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>>>     printf("})\n");
>>>>>
>>>>
>>>> Interesting concept, but completely non-standard and very much
>>>> different from real hardware. Please at least add a comment there to
>>>> show readers that Xen is doing a hack which is not at all related to
>>>> how the PIIX really works.
>>>
>>> Isn't this more a function of the "wires" on the motherboard than the
>>> PIIX specifically? i.e. this just encodes the permutation of the wires
>>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>>
>> Interrupts with PCI work slightly differently. PCI devices can map (themselves
>> or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These
>> get converted using PCI host controller specific logic to 4 interrupt lines
>> which then go into the IO-APIC.
>>
>> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could
>> be 26 though.
>
> The number of redirection entries in the IOAPIC can be discovered
> reading from the IOAPICVER register and it is a property of a specific
> model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
> pins than the most popular IOAPIC used with PIIX3.

which means you're emulating hardware that never existed :).

>
>
>> I haven't seen a single case where PCI devices have a direct link to the
>> IO-APIC. I also have not seen any PCI host controller that exports more than
>> 4 interrupts. Giving each PCI device its own line, on top of that more than
>> ever could be in real hardware, is a plain hack IMHO.
>
> Actually this happens quite often: if I am not mistaken all the GSIs
> higher than 15 are actually the result of a direct connection between
> an interrupt source and the IOAPIC. I have several on my testboxes.

Yes. "Interrupt source" meaning a wire on the board. I haven't seen any
situation so far where you get direct IO-APIC connections to PCI _device_
pins. You obviously get plenty of connections to PCI _bus_ pins.

> Also give a look at the Intel Multiprocessor Specification, section
> 3.6.2.3: as you can see from the diagram in "Symmetric I/O Mode" all the
> interrupts are routed through the IOAPIC directly.
>
>
>> Did this really give you actual performance/latency/scalability gains? I
>> still think for devices that matter, we should go with MSI rather than
>> deriving from real hw.
>>
>
> Not all the operating systems support MSIs, it is nice to be able to
> avoid interrupt sharing without recurring to MSIs.

Yes and no. It's a tradeoff. If no interrupt sharing means that we emulate
hardware that simply never could have existed the way we model it, I think
it's a bad idea.

> Also this is how Xen has been working for more than 5 years in HVM mode,
> so this configuration is well tested and supported by most operating
> systems (at least all the ones we tried so far).

I'm fine with Xen breaking its own neck, as long as it doesn't affect
non-Xen code paths. Just be aware that I'm not a huge fan of this
approach :).

> In any case I think it is a good idea to add a comment to better explain
> what we are doing, see below.
>
>
>
> commit 973bb091a967fdec37a1bc8fe30d46a483d2903d
> Author: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> Date:   Tue May 17 12:10:36 2011 +0000
>
>     xen: fix interrupt routing
>
>     - remove i440FX-xen and i440fx_write_config_xen
>     we don't need to intercept pci config writes to i440FX anymore;
>
>     - introduce PIIX3-xen and piix3_write_config_xen
>     we do need to intercept pci config write to the PCI-ISA bridge to update
>     the PCI link routing;
>
>     - set the number of PIIX3-xen interrupt lines to 128;

I still find it unpretty and I'm pretty sure it's completely different from
real hardware, but since Xen code is your call and this doesn't affect
non-Xen workloads, I won't block it, unless someone else is very much
opposed to it.

Please resend as proper patch.

Alex

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
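
For readers following the thread, the PRTA generator quoted above can be restated as two small helpers. This is only a sketch of the arithmetic in that printf loop, not code from the patch: the names prta_gsi() and prta_address() are made up for illustration. The GSI expression is copied verbatim from the quoted loop, and the address layout (device number in the high word, 0xFFFF in the low word meaning "any function") follows the "0x%04xffff" format string.

```c
#include <stdint.h>

/* Hypothetical helper: GSI assigned to PCI device `dev` (1..31) and pin
 * `intx` (0 = INTA .. 3 = INTD), using the expression from the quoted
 * PRTA generator. The "& 31" keeps the result within 32 lines and the
 * "+ 16" places them above the 16 legacy ISA IRQs, i.e. in 16..47. */
static unsigned prta_gsi(unsigned dev, unsigned intx)
{
    return ((dev * 4 + dev / 8 + intx) & 31) + 16;
}

/* Hypothetical helper: the address field of one routing entry, as
 * printed by "0x%04xffff": device in the high 16 bits, 0xFFFF below. */
static uint32_t prta_address(unsigned dev)
{
    return ((uint32_t)dev << 16) | 0xFFFFu;
}
```

With this mapping, device 1 INTA lands on GSI 20 and device 8 INTA on GSI 17. Since 31 devices times 4 pins are folded onto 32 lines, some sharing between pins still occurs.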
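
Stefano's remark about IOAPICVER can be illustrated with a minimal sketch. It assumes the register layout documented for the Intel 82093AA: bits 0-7 hold the version and bits 16-23 hold the index of the highest redirection entry, so the pin count is that field plus one. The function name ioapic_num_pins() is hypothetical, and the sketch only decodes a value that has already been read; how the register is accessed is left out.

```c
#include <stdint.h>

/* Hypothetical helper: number of redirection entries (input pins)
 * encoded in an IOAPICVER value. Bits 16-23 hold the index of the
 * last entry, so the count is that field plus one. */
static unsigned ioapic_num_pins(uint32_t ioapicver)
{
    return ((ioapicver >> 16) & 0xFFu) + 1;
}
```

An 82093AA reports 0x00170011 in this register, which decodes to 24 pins. An emulated IOAPIC that reported 0x2F in the same field would decode to 48.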