Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs
On 12/5/23 06:08, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>> On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
>>>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>>>> On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
>>>>>> @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
>>>>>>          bus = PCI_BUS(machine_sbdf);
>>>>>>          devfn = PCI_DEVFN(machine_sbdf);
>>>>>>
>>>>>> +        if ( needs_vpci(d) && !has_vpci(d) )
>>>>>> +        {
>>>>>> +            printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI
>>>>>> support not enabled\n",
>>>>>> +                   &PCI_SBDF(seg, bus, devfn), d);
>>>>>> +            ret = -EPERM;
>>>>>> +            break;
>>>>>
>>>>> I think this is likely too restrictive going forward. The current
>>>>> approach is indeed to enable vPCI on a per-domain basis because that's
>>>>> how PVH dom0 uses it, due to being unable to use ioreq servers.
>>>>>
>>>>> If we start to expose vPCI support to guests the interface should be on
>>>>> a per-device basis, so that vPCI could be enabled for some devices,
>>>>> while others could still be handled by ioreq servers.
>>>>>
>>>>> We might want to add a new flag to xen_domctl_assign_device (used by
>>>>> XEN_DOMCTL_assign_device) in order to signal whether the device will
>>>>> use vPCI.
>>>>
>>>> Actually I don't think this is a good idea. I am all for flexibility but
>>>> supporting multiple different configurations comes at an extra cost for
>>>> both maintainers and contributors. I think we should try to reduce the
>>>> amount of configurations we support rather than increasing them
>>>> (especially on x86 where we have PV, PVH, HVM).
>>>
>>> I think it's perfectly fine to initially require a domain to have all
>>> its devices either passed through using vPCI or ioreqs, but the
>>> interface IMO should allow for such differentiation in the future.
>>> That's why I think introducing a domain wide vPCI flag might not be
>>> the best option going forward.
>>>
>>> It would be perfectly fine for XEN_DOMCTL_assign_device to set a
>>> domain wide vPCI flag, iow:
>>>
>>> if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
>>> {
>>>     if ( has_arch_pdevs(d) )
>>>     {
>>>         printk("All passthrough devices must use the same backend\n");
>>>         return -EINVAL;
>>>     }
>>>
>>>     /* Set vPCI domain flag */
>>> }
>>
>> That would be fine by me, but maybe we can avoid this change too. I was
>> imagining that vPCI would be enabled at domain creation, not at runtime.
>> And that vPCI would be enabled by default for all PVH guests (once we
>> are past the initial experimental phase.)
>
> Then we don't even need a new CDF flag, and just enable vPCI when
> IOMMU is enabled? IOW: we can key the enabling of vPCI to
> XEN_DOMCTL_CDF_iommu for specific domain types?

There are many Arm based platforms that need to use the IOMMU but don't
have (or don't use) PCI, so we'd still like to have a separate vPCI flag.

>
> Maybe that's not so trivial on x86, as there's no x86 PVH domain type
> from the hypervisor PoV.
>
>>
>>> We have already agreed that we want to aim for a setup where ioreqs
>>> and vPCI could be used for the same domain, but I guess you assumed
>>> that ioreqs would be used for emulated devices exclusively and vPCI
>>> for passthrough devices?
>>
>> Yes, that's right
>>
>>
>>> Overall if we agree that ioreqs and vPCI should co-exist for a domain,
>>> I'm not sure there's much reason to limit ioreqs to only handle
>>> emulated devices, seems like an arbitrary limitation.
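Purely as an illustration of the per-device flag idea discussed earlier in
this exchange, a rough sketch of what the hook in the device assignment path
could look like. The XEN_DOMCTL_DEV_USE_VPCI name below is invented for the
example and does not exist in the tree; this is a simplified sketch, not a
proposal.

    /* Hypothetical flag, next to XEN_DOMCTL_DEV_RDM_RELAXED in public/domctl.h. */
    #define XEN_DOMCTL_DEV_USE_VPCI  (1u << 1)

    /* Simplified fragment of the XEN_DOMCTL_assign_device handling in
     * iommu_do_pci_domctl(), before the device gets assigned: */
    if ( (domctl->u.assign_device.flags & XEN_DOMCTL_DEV_USE_VPCI) &&
         !has_vpci(d) )
    {
        if ( has_arch_pdevs(d) )
        {
            /* Refuse to mix vPCI and ioreq backed devices in one domain. */
            printk(XENLOG_G_WARNING
                   "%pd: all passthrough devices must use the same backend\n",
                   d);
            ret = -EINVAL;
            break;
        }

        /* Set the domain wide vPCI flag, as in the pseudocode above. */
    }
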
>>
>> Reply below
>>
>>
>>>> I don't think we should enable IOREQ servers to handle PCI passthrough
>>>> for PVH guests and/or guests with vPCI. If the domain has vPCI, PCI
>>>> passthrough can be handled by vPCI just fine. I think this should be a
>>>> good anti-feature to have (a goal to explicitly not add this feature) to
>>>> reduce complexity. Unless you see a specific use case to add support for
>>>> it?
>>>
>>> There are passthrough devices (GPUs) that might require some extra
>>> mediation on dom0 (like the Intel GVT-g thing) that would force the
>>> use of ioreqs for passthrough.
>>
>> From an architectural perspective, I think it would be cleaner, simpler
>> to maintain, and simpler to understand if Xen was the sole owner of the
>> PCI Root Complex and PCI config space mediation (implemented with vPCI).
>> IOREQ can be used for emulation and it works very well for that. At
>> least in my mind, that makes things much simpler.
>
> But IOREQ already has all the code to mediate accesses to the PCI
> config space, and the interface to register separate servers for
> different PCI devices.
>
> We would then need to duplicate this internally for vPCI, so that vPCI
> could forward accesses to IOREQ just for IOREQ to forward to yet a
> different component? Seems like a lot of duplication for no benefit.
>
>> I understand there are non-trivial cases, like virtual GPUs with
>> hardware access, but I don't classify those as passthrough. That's
>> because there isn't one device that gets fully assigned to the guest.
>> Instead, there is an emulated device (hence IOREQ) with certain MMIO
>> regions and interrupts that end up being directly mapped from real
>> hardware.
>>
>> So I think it is natural in those cases to use IOREQ and it is also
>> natural to have QEMU remap MMIO/IRQs at runtime. From a vPCI
>> perspective, I hope it will mostly look as if the device is assigned to
>> Dom0. Even if it ends up being more complex than that, Rome wasn't
>> built in one day, and I don't think we should try to solve this problem
>> on day 1 (as long as the interfaces are not stable interfaces).
>
> I don't see IOREQ as dealing explicitly with emulation. Yes, it does
> allow for emulators to be implemented in user-space, but at the end
> it's just an interface that allows forwarding accesses to certain
> resources (for the case we are speaking about here, PCI config space)
> to entities that registered as handlers.
>
> vPCI OTOH just deals with a very specific resource (PCI config space)
> and only allows internal handlers to be registered at byte
> granularity.
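As a concrete reference point for the "internal handlers at byte
granularity" remark above, this is roughly how a vPCI extension hooks a
register today. The shape of vpci_add_register() is recalled from
xen/include/xen/vpci.h and may differ slightly (e.g. handler attributes);
the register offset and values are made up for the example.

    /* Illustrative sketch of a vPCI internal handler registration. */
    static uint32_t dummy_read(const struct pci_dev *pdev, unsigned int reg,
                               void *data)
    {
        /* Return a fixed value for the emulated register. */
        return 0xffffffff;
    }

    static void dummy_write(const struct pci_dev *pdev, unsigned int reg,
                            uint32_t val, void *data)
    {
        /* Drop guest writes to the emulated register. */
    }

    static int init_dummy(struct pci_dev *pdev)
    {
        /*
         * Register a 4-byte handler at a made-up config space offset 0x40.
         * vPCI tracks handlers per byte range, so overlapping registrations
         * are rejected.
         */
        return vpci_add_register(pdev->vpci, dummy_read, dummy_write,
                                 0x40, 4, NULL);
    }
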
>
> So your proposal would be to implement a hierarchy like the one on the
> diagram below:
>
>    ┌────────┐ ┌──────────┐ ┌──────────────────┐
>    │ Memory │ │ IO Ports │ │ PCI config space │
>    └───────┬┘ └┬─────────┘ └───┬──────────────┘
>            │   │               │
>            │   │           ┌───┴──┐
>            │   │           │ vPCI │
>            │   │           └─┬──┬─┘
>         ┌──┴───┴┐            │  │
>         │ IOREQ ├────────────┘  │
>         └────┬──┘               │
>              │                  │
> ┌────────────┴──┐              ┌┴──────────────┐
> │ IOREQ servers │              │ vPCI handlers │
> └───────────────┘              └───────────────┘
>
> While what I'm proposing would look like:
>
>    ┌────────┐ ┌──────────┐ ┌──────────────────┐
>    │ Memory │ │ IO Ports │ │ PCI config space │
>    └────┬───┘ └────┬─────┘ └────────┬─────────┘
>         │          │                │
>         └─────┬────┴────┬───────────┘
>               │  IOREQ  │
>               └─┬─────┬─┘
>                 │     │
> ┌───────────────┤     └────┬──────┐
> │ IOREQ servers │          │ vPCI │
> └───────────────┘          └───┬──┘
>                                │
>                            ┌───┴───────────┐
>                            │ vPCI handlers │
>                            └───────────────┘
>
> I'm obviously biased, but I think the latter is cleaner, and allows
> all resources to be arbitrated by the same component (IOREQ).
>
> If the concern is about the IOREQ hypercall interface, it would be
> fine to introduce an option that limits IOREQs to internal users
> (vPCI) without supporting external IOREQ servers.
>
> Think of IOREQ as a resource mediator inside of Xen, that just does
> the PCI address decoding and forwards the access to the interested
> party, either an external IOREQ server or vPCI.
>
>>
>>> It's important that the interfaces we introduce are correct IMO,
>>> because that ends up reflecting on the configuration options that we
>>> expose to users on xl/libxl. While both XEN_DOMCTL_createdomain and
>>> XEN_DOMCTL_assign_device are unstable interfaces, how the vPCI option
>>> gets placed there will ultimately influence how the option gets
>>> exposed in xl/libxl, and the interface there is relevant to keep
>>> stable for end user sanity.
>>
>> I agree with you on the stable interfaces. The important part is not to
>> introduce changes to stable interfaces that could limit us in the
>> future. Specifically that includes xl and libxl, we need to be careful
>> there. But I don't see a single per-domain vPCI enable/disable option as
>> a problem. Let's say that in the future we have a mediated vGPU
>> implementation: if it works together with vPCI then the per-domain vPCI
>> option in libxl will be enabled (either explicitly or by default), if
>> it doesn't then vPCI will be disabled (either explicitly or by the
>> newer vGPU options.)
>
> If vPCI is hooked into IOREQ there won't be a need anymore to register
> the vPCI config space traps, as that would be done by IOREQ, and hence
> vPCI managed devices could be registered at runtime with IOREQ. IOW:
> there won't be a need anymore to signal at domain creation whether
> vPCI is intended to be used or not.
>
> We would obviously need to enable IOREQ for all domains with IOMMU
> enabled, as it would be IOREQ that registers the PCI config space
> handlers.
>
>> For *unstable* interfaces (XEN_DOMCTL_assign_device) I would rather wait
>> before adding more changes on top of them, not because I don't care
>> about the mediated GPU problem (we do have something similar at AMD),
>> but because I worry that if we try to change them now we might not do a
>> good enough job. I would prefer to wait until we know more about the
>> actual use case, ideally with code supporting it.
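To make the second diagram above slightly more concrete, here is a purely
illustrative sketch of the dispatch such a mediator would do for a config
space read. Every helper name below is invented for the example; only
vpci_read() is meant to stand in for the existing vPCI entry point, with its
signature recalled from memory.

    /* Illustrative only: single mediator for PCI config space reads. */
    static uint32_t mediate_pci_cfg_read(struct domain *d, pci_sbdf_t sbdf,
                                         unsigned int reg, unsigned int size)
    {
        /* Hypothetical lookup: has an external IOREQ server claimed this SBDF? */
        struct ioreq_server *s = ioreq_server_for_pci(d, sbdf);

        if ( !s )
            /* No external server: terminate the access internally via vPCI. */
            return vpci_read(sbdf, reg, size);

        /* Hypothetical forwarding of the access to the external server. */
        return ioreq_forward_pci_read(s, sbdf, reg, size);
    }
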
>>
>> I think the difference in points of view comes from the fact that I see
>> vPCI as the default, QEMU only as a limited peripheral emulator (or
>> mediator for the vGPU case) but not in control. vPCI and QEMU are not
>> equal in my view. vPCI is in charge and always present except in very
>> uncommon setups (even if we decide to hook it inside Xen by using
>> internal IOREQ interfaces). QEMU might come and go.
>
> Xen needs a single component that mediates accesses to resources,
> whether that's IOREQ, or something else I don't really care that much.
> Having vPCI mediate accesses to the PCI config space, and IOREQ to the
> memory (and on x86 IO port) space just seems awfully complicated for
> AFAICT no real benefit.
>
> Also, you seem to conflate IOREQ with QEMU; while the latter is
> indeed a user of IOREQ, I do see IOREQ as a simple resource mediator
> inside of Xen, that has the ability to forward such accesses to
> external emulators using a hypercall interface.
>
>> Now that I am writing this, I realize this is also why I wasn't too
>> happy with the idea of hooking vPCI into IOREQ. It makes them look as
>> if they are the same, while I don't think they should be considered at the
>> same level of priority, criticality, safety, integration in the system,
>> etc.
>
> I feel there are some fears with IOREQ from a safety PoV? The code
> that does the resource multiplexing is small, and as said above if
> there are safety concerns with the hypercall interface it would be
> fine to limit its usage to internal handlers only.
>
> Thanks, Roger.