Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology
On Fri, 17 Nov 2023, Julien Grall wrote:
> Hi Stefano,
>
> On 16/11/2023 23:28, Stefano Stabellini wrote:
> > On Thu, 16 Nov 2023, Julien Grall wrote:
> > > IIUC, this means that Xen will allocate the BDF. I think this will
> > > become a problem quite quickly as some of the PCI devices may need to
> > > be assigned at a specific vBDF (I have the Intel graphics card in mind).
> > >
> > > Also, xl allows you to specify the slot (e.g. <bdf>@<vslot>) which
> > > would not work with this approach.
> > >
> > > For dom0less passthrough, I feel the virtual BDF should always be
> > > specified in device-tree. When a domain is created after boot, then I
> > > think you want to support <bdf>@<vslot> where <vslot> is optional.
> >
> > Hi Julien,
> >
> > I also think there should be a way to specify the virtual BDF, but if
> > possible (meaning: it is not super difficult to implement) I think it
> > would be very convenient if we could let Xen pick whatever virtual BDF
> > Xen wants when the user doesn't specify the virtual BDF. That's
> > because it would make it easier to specify the configuration for the
> > user. Typically the user doesn't care about the virtual BDF, only to
> > expose a specific host device to the VM. There are exceptions of course
> > and that's why I think we should also have a way for the user to
> > request a specific virtual BDF. One of these exceptions is integrated
> > GPUs: the OS drivers used to have hardcoded BDFs. So it wouldn't work if
> > the device shows up at a different virtual BDF compared to the host.
>
> If you let Xen allocate the vBDF, then wouldn't you need a way to tell the
> toolstack/Device Models which vBDF was allocated?
>
> >
> > Thinking more about this, one way to simplify the problem would be if we
> > always reuse the physical BDF as virtual BDF for passthrough devices. I
> > think that would solve the problem and make it much more unlikely to
> > run into driver bugs.
>
> This works so long as you have only one physical segment (i.e. hostbridge).
> If you have multiple ones, then you either have to expose multiple
> hostbridges to the guest (which is not great) or need someone to allocate
> the vBDF.
>
> >
> > And we allocate a "special" virtual BDF space for emulated devices, with
> > the Root Complex still emulated in Xen. For instance, we could reserve
> > ff:xx:xx.
> Hmmm... Wouldn't this mean reserving ECAM space for 256 buses? Obviously, we
> could use 5 (just as a random number). Yet, it still requires reserving more
> memory than necessary.
>
> > and in case of clashes we could refuse to continue.
>
> Urgh. And what would be the solution for users triggering this clash?
>
> > Or we could
> > allocate the first free virtual BDF, after all the passthrough devices.
>
> This only works if you don't want to support PCI hotplug. It may not be a
> thing for embedded, but it is used by cloud. So you need a mechanism that
> works with hotplug as well.
>
> >
> > Example:
> > - the user wants to assign physical 00:11.5 and b3:00.1 to the guest
> > - Xen creates virtual BDFs 00:11.5 and b3:00.1 for the passthrough devices
> > - Xen allocates the next virtual BDF for emulated devices: b4:xx.x
> > - If more virtual BDFs are needed for emulated devices, Xen allocates
> >   b5:xx.x
> >
> > I still think, no matter the BDF allocation scheme, that we should try
> > to avoid as much as possible having two different PCI Root Complex
> > emulators. Ideally we would have only one PCI Root Complex emulated by
> > Xen. Having 2 PCI Root Complexes, both of them emulated by Xen, would be
> > tolerable but not ideal. The worst case I would like to avoid is to have
> > two PCI Root Complexes, one emulated by Xen and one emulated by QEMU.
>
> So while I agree that one emulated hostbridge is the best solution, I don't
> think your proposal would work. As I wrote above, you may have a system with
> multiple physical hostbridges.
> It would not be possible to assign two PCI devices with the same BDF but
> from different segments.
>
> I agree it is unlikely, but if we can avoid it then it would be best. There
> is one scheme which fits that:
>   1. If the vBDF is not specified, then pick a free one.
>   2. Otherwise check if the specified vBDF is free. If not, return an error.
>
> This scheme should be used for both virtual and physical. This is pretty much
> the algorithm used by QEMU today. It works, so what would be the benefit of
> doing something different?

I am OK with that. I was trying to find a way that could work without
user intervention in almost 100% of the cases. I think both 1. and 2.
you proposed are fine.
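
For reference, the two-step scheme agreed on above can be sketched as
follows. This is only an illustrative sketch in Python, not code from the
vPCI series or from QEMU: `VBDFAllocator`, `allocate` and `release` are
hypothetical names, and a single emulated host bridge with one virtual bus
is assumed. The same allocator serves both passthrough and emulated devices,
and because slots can be released and reused at any time, it does not depend
on all devices being known at domain creation (the hotplug concern raised
above).

```python
from typing import Optional, Set, Tuple

# A virtual BDF on the single emulated host bridge: (bus, device, function).
BDF = Tuple[int, int, int]

MAX_DEVICES = 32   # PCI allows 32 device slots per bus
MAX_FUNCTIONS = 8  # and 8 functions per device


class VBDFAllocator:
    """Hypothetical per-domain virtual BDF allocator (illustration only)."""

    def __init__(self, nr_buses: int = 1) -> None:
        self.nr_buses = nr_buses
        self.used: Set[BDF] = set()

    def _slot_in_use(self, bus: int, dev: int) -> bool:
        # A slot counts as busy if any of its functions is allocated.
        return any(b == bus and d == dev for (b, d, _) in self.used)

    def allocate(self, requested: Optional[BDF] = None) -> BDF:
        if requested is not None:
            # Step 2: the caller asked for a specific vBDF. Use it if it is
            # valid and free, otherwise report an error.
            bus, dev, fn = requested
            if not (0 <= bus < self.nr_buses
                    and 0 <= dev < MAX_DEVICES
                    and 0 <= fn < MAX_FUNCTIONS):
                raise ValueError(f"vBDF {requested} is out of range")
            if requested in self.used:
                raise ValueError(f"vBDF {requested} is already in use")
            self.used.add(requested)
            return requested

        # Step 1: no vBDF specified, so pick the first free slot and place
        # the device at function 0 of that slot.
        for bus in range(self.nr_buses):
            for dev in range(MAX_DEVICES):
                if not self._slot_in_use(bus, dev):
                    bdf = (bus, dev, 0)
                    self.used.add(bdf)
                    return bdf
        raise RuntimeError("no free virtual BDF left")

    def release(self, bdf: BDF) -> None:
        # Called on device removal so the slot can be reused later.
        self.used.discard(bdf)
```

A device that must sit at a fixed address (the integrated GPU case) would be
added with an explicit request such as `allocate((0, 2, 0))`, while a plain
`allocate()` leaves the choice to the allocator. In this sketch the
auto-picked slot is simply the lowest free one; any other policy would fit
the scheme equally well.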