Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology
On Wed, 29 Nov 2023, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 11:45:34PM +0000, Volodymyr Babchuk wrote:
> > Hi Roger,
> >
> > Roger Pau Monné <roger.pau@xxxxxxxxxx> writes:
> >
> > > On Wed, Nov 22, 2023 at 01:18:32PM -0800, Stefano Stabellini wrote:
> > >> On Wed, 22 Nov 2023, Roger Pau Monné wrote:
> > >> > On Tue, Nov 21, 2023 at 05:12:15PM -0800, Stefano Stabellini wrote:
> > >> > > Let me expand on this. Like I wrote above, I think it is important
> > >> > > that Xen vPCI is the only in-use PCI Root Complex emulator. If it
> > >> > > makes the QEMU implementation easier, it is OK if QEMU emulates an
> > >> > > unneeded and unused PCI Root Complex. From Xen's point of view, it
> > >> > > doesn't exist.
> > >> > >
> > >> > > In terms of ioreq registration, QEMU calls
> > >> > > xendevicemodel_map_pcidev_to_ioreq_server for each PCI BDF it wants
> > >> > > to emulate. That way, Xen vPCI knows exactly what PCI config space
> > >> > > reads/writes to forward to QEMU.
> > >> > >
> > >> > > Let's say that:
> > >> > > - 00:02.0 is a PCI passthrough device
> > >> > > - 00:03.0 is a PCI emulated device
> > >> > >
> > >> > > QEMU would register 00:03.0, and vPCI would know to forward anything
> > >> > > related to 00:03.0 to QEMU, but not 00:02.0.
> > >> >
> > >> > I think there's some work here so that we have a proper hierarchy
> > >> > inside of Xen. Right now both ioreq and vpci expect to decode the
> > >> > accesses to the PCI config space, and set up (MM)IO handlers to trap
> > >> > ECAM, see vpci_ecam_{read,write}().
> > >> >
> > >> > I think we want to move to a model where vPCI doesn't set up MMIO
> > >> > traps itself, and instead relies on ioreq to do the decoding and
> > >> > forwarding of accesses. We need some work in order to represent an
> > >> > internal ioreq handler, but that shouldn't be too complicated. IOW:
> > >> > vpci should register the devices it's handling with ioreq, much like
> > >> > QEMU does.
> > >>
> > >> I think this could be a good idea.
> > >>
> > >> This would be the very first IOREQ handler implemented in Xen itself,
> > >> rather than outside of Xen. Some code refactoring might be required,
> > >> which worries me given that vPCI is at v10 and has been pending for
> > >> years. I think it could make sense as a follow-up series, not v11.
> > >
> > > That's perfectly fine for me; most of the series here just deals with
> > > the logic to intercept guest access to the config space and is
> > > completely agnostic as to how the accesses are intercepted.
> > >
> > >> I think this idea would be beneficial if, in the example above, vPCI
> > >> doesn't really need to know about device 00:03.0. vPCI registers via
> > >> IOREQ the PCI Root Complex and device 00:02.0 only, QEMU registers
> > >> 00:03.0, and everything works. vPCI is not involved at all in PCI
> > >> config space reads and writes for 00:03.0. If this is the case, then
> > >> moving vPCI to IOREQ could be good.
> > >
> > > Given your description above, with the root complex implemented in
> > > vPCI, we would need to mandate vPCI together with ioreqs even if no
> > > passthrough devices are using vPCI itself (just for the emulation of
> > > the root complex). Which is fine, I just wanted to mention the
> > > dependency.
> > >
> > >> On the other hand, if vPCI actually needs to know that 00:03.0 exists,
> > >> perhaps because it changes something in the PCI Root Complex emulation,
> > >> or vPCI needs to take some action when PCI config space registers of
> > >> 00:03.0 are written to, then I think this model doesn't work well. If
> > >> this is the case, then I think it would be best to keep vPCI as the
> > >> MMIO handler and let vPCI forward to IOREQ when appropriate.
> > >
> > > At first approximation I don't think we would have such interactions;
> > > otherwise the whole premise of ioreq being able to register individual
> > > PCI devices would be broken.
> > >
> > > XenServer already has scenarios with two different user-space emulators
> > > (i.e. two different ioreq servers) handling accesses to different
> > > devices on the same PCI bus, and there's no interaction with the root
> > > complex required.

Good to hear.

> > Out of curiosity: how are legacy PCI interrupts handled in this case? In
> > my understanding, it is the Root Complex's responsibility to propagate
> > correct IRQ levels to an interrupt controller?
>
> I'm unsure whether my understanding of the question is correct, so my
> reply might not be what you are asking for, sorry.
>
> Legacy IRQs (GSI on x86) are set up directly by the toolstack when the
> device is assigned to the guest, using PHYSDEVOP_map_pirq +
> XEN_DOMCTL_bind_pt_irq. Those hypercalls bind a host IO-APIC pin to a
> guest IO-APIC pin, so that interrupts originating from that host
> IO-APIC pin are always forwarded to the guest and injected as
> originating from the guest IO-APIC pin.
>
> Note that the device will always use the same IO-APIC pin; this is not
> configured by the OS.

QEMU calls xen_set_pci_intx_level, which is implemented by
xendevicemodel_set_pci_intx_level, which issues XEN_DMOP_set_pci_intx_level,
which does set_pci_intx_level. Eventually it calls hvm_pci_intx_assert
and hvm_pci_intx_deassert. I don't think any of this goes via the Root
Complex; otherwise, as Roger pointed out, it wouldn't be possible to
emulate individual PCI devices in separate IOREQ servers.