[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
On Fri, Jul 07, 2023 at 04:10:14PM +0200, Juergen Gross wrote: > On 07.07.23 11:50, Roger Pau Monné wrote: > > On Fri, Jul 07, 2023 at 06:38:48AM +0200, Juergen Gross wrote: > > > On 06.07.23 23:49, Stefano Stabellini wrote: > > > > On Thu, 6 Jul 2023, Roger Pau Monné wrote: > > > > > On Wed, Jul 05, 2023 at 03:41:10PM -0700, Stefano Stabellini wrote: > > > > > > On Wed, 5 Jul 2023, Roger Pau Monné wrote: > > > > > > > On Tue, Jul 04, 2023 at 08:14:59PM +0300, Oleksandr Tyshchenko > > > > > > > wrote: > > > > > > > > Part 2 (clarification): > > > > > > > > > > > > > > > > I think using a special config space register in the root > > > > > > > > complex would > > > > > > > > not be terrible in terms of guest changes because it is easy to > > > > > > > > introduce a new root complex driver in Linux and other OSes. > > > > > > > > The root > > > > > > > > complex would still be ECAM compatible so the regular ECAM > > > > > > > > driver would > > > > > > > > still work. A new driver would only be necessary if you want to > > > > > > > > be able > > > > > > > > to access the special config space register. > > > > > > > > > > > > > > I'm slightly worry of this approach, we end up modifying a root > > > > > > > complex emulation in order to avoid modifying a PCI device > > > > > > > emulation > > > > > > > on QEMU, not sure that's a good trade off. > > > > > > > > > > > > > > Note also that different architectures will likely have different > > > > > > > root > > > > > > > complex, and so you might need to modify several of them, plus > > > > > > > then > > > > > > > arrange the PCI layout correctly in order to have the proper > > > > > > > hierarchy > > > > > > > so that devices belonging to different driver domains are > > > > > > > assigned to > > > > > > > different bridges. > > > > > > > > > > > > I do think that adding something to the PCI conf register somewhere > > > > > > is > > > > > > the best option because it is not dependent on ACPI and it is not > > > > > > dependent on xenstore both of which are very undesirable. > > > > > > > > > > > > I am not sure where specifically is the best place. These are 3 > > > > > > ideas > > > > > > we came up with: > > > > > > 1. PCI root complex > > > > > > 2. a register on the device itself > > > > > > 3. a new capability of the device > > > > > > 4. add one extra dummy PCI device for the sole purpose of exposing > > > > > > the > > > > > > grants capability > > > > > > > > > > > > > > > > > > Looking at the spec, there is a way to add a vendor-specific > > > > > > capability > > > > > > (cap_vndr = 0x9). Could we use that? It doesn't look like it is used > > > > > > today, Linux doesn't parse it. > > > > > > > > > > I did wonder the same from a quick look at the spec. There's however > > > > > a text in the specification that says: > > > > > > > > > > "The driver SHOULD NOT use the Vendor data capability except for > > > > > debugging and reporting purposes." > > > > > > > > > > So we would at least need to change that because the capability would > > > > > then be used by other purposes different than debugging and reporting. > > > > > > > > > > Seems like a minor adjustment, so might we worth asking upstream about > > > > > their opinion, and to get a conversation started. > > > > > > > > Wait, wouldn't this use-case fall under "reporting" ? It is exactly what > > > > we are doing, right? > > > > > > I'd understand "reporting" as e.g. logging, transferring statistics, ... > > > > > > We'd like to use it for configuration purposes. > > > > I've also read it that way. > > > > > Another idea would be to enhance the virtio IOMMU device to suit our > > > needs: > > > we could add the domid as another virtio IOMMU device capability and (for > > > now) > > > use bypass mode for all "productive" devices. > > > > If we have to start adding capabilties, won't it be easier to just add > > it to the each device instead of adding it to virtio IOMMU. Or is the > > parsing of capabilities device specific, and hence we would have to > > implement such parsing for each device? I would expect some > > capabilities are shared between all devices, and a Xen capability could > > be one of those. > > Have a look at [1], which is describing the common device config layout. > The problem here is that we'd need to add the domid after the queue specific > data, resulting in a mess if further queue fields would be added later. > > We could try that, of course. Right, we must make it part of the standard if we modify virtio_pci_common_cfg, or else newly added fields would overlap the Xen specific one. Would it be possible to signal Xen-grants support in the `device_feature` field, and then expose it from a vendor capability? IOW, would it be possible to add a Xen-specific hook in the parsing of virtio_pci_common_cfg that would then fetch additional data from a capability? That would likely be less intrusive than adding a new Xen-specific field to virtio_pci_common_cfg while still allowing us to do Xen specific configuration for all VirtIO devices. > > > > > Later we could even add grant-V3 support to Xen and to the virtio IOMMU > > > device > > > (see my last year Xen Summit design session). This could be usable for > > > disaggregated KVM setups, too, so I believe there is a chance to get this > > > accepted. > > > > > > > > > > > ********** > > > > > > > > What do you think about it? Are there any pitfalls, etc? This > > > > > > > > also requires > > > > > > > > system changes, but at least without virtio spec changes. > > > > > > > > > > > > > > Why are we so reluctant to add spec changes? I understand this > > > > > > > might > > > > > > > take time an effort, but it's the only way IMO to build a > > > > > > > sustainable > > > > > > > VirtIO Xen implementation. Did we already attempt to negotiate > > > > > > > with > > > > > > > Oasis Xen related spec changes and those where refused? > > > > > > > > > > > > That's because spec changes can be very slow. This is a bug that we > > > > > > need > > > > > > a relatively quick solution for and waiting 12-24 months for a spec > > > > > > update is not realistic. > > > > > > > > > > > > I think a spec change would be best as a long term solution. We also > > > > > > need a short term solution. The short term solution doesn't have to > > > > > > be > > > > > > ideal but it has to work now. > > > > > > > > > > My fear with such approach is that once a bodge is in place people > > > > > move on to other stuff and this never gets properly fixed. > > > > > > > > > > I know this might not be a well received opinion, but it would be > > > > > better if such bodge is kept in each interested party patchqueue for > > > > > the time being, until a proper solution is implemented. That way > > > > > there's an interest from parties into properly fixing it upstream. > > > > > > > > Unfortunately we are in the situation where we have an outstanding > > > > upstream bug, so we have to take action one way or the other. > > > > > > The required virtio IOMMU device modification would be rather small, so > > > adding it maybe under a CONFIG option defaulting to off might be > > > acceptable. > > > > Would you then do the grant allocation as part of virtio IOMMU? > > Long term, maybe. Do you remember my Grant-V3 design session last year? Being > able to reuse the same layout for virtio IOMMU was one of the basic ideas for > that layout (this would need some heavy work on the virtio IOMMU frontend and > backend, of course). While this might well be the best option, do we have anyone with the time and expertise to work on this? I might be wrong, but it seems like a huge task. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |