Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +0000, Oleksandr Andrushchenko wrote:
>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>
>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev,
>>>>>> +                                  unsigned int reg, void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>> +
>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>> +    if ( !vbar )
>>>>>> +        return ~0;
>>>>>> +
>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>> +}
>>>>>> +
>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev,
>>>>>> +                               unsigned int reg, uint32_t val,
>>>>>> +                               void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>> +    else
>>>>>> +    {
>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev,
>>>>>> +                                             bar->index);
>>>>>> +
>>>>>> +        if ( !vbar )
>>>>>> +            return;
>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>> +    }
>>>>>> +}
>>>>> You should assign different handlers based on whether the domain that
>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>> doing the selection here.
>>>> Hm, handlers are assigned once in init_bars and this function is only
>>>> called for hwdom, so there is no way I can do that for the guests.
>>>> Hence, the dispatcher.
>>> I think we might want to reset the vPCI handlers when a device gets
>>> assigned and deassigned.
>> In the ARM case init_bars is called too early: PCI device assignment is
>> currently initiated by Domain-0's kernel and is done *before* PCI devices
>> are given memory ranges and BARs assigned:
>>
>> [    0.429514] pci_bus 0000:00: root bus resource [bus 00-ff]
>> [    0.429532] pci_bus 0000:00: root bus resource [io  0x0000-0xfffff]
>> [    0.429555] pci_bus 0000:00: root bus resource [mem 0xfe200000-0xfe3fffff]
>> [    0.429575] pci_bus 0000:00: root bus resource [mem 0x30000000-0x37ffffff]
>> [    0.429604] pci_bus 0000:00: root bus resource [mem 0x38000000-0x3fffffff pref]
>> [    0.429670] pci 0000:00:00.0: enabling Extended Tags
>> [    0.453764] pci 0000:00:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.453793] pci 0000:00:00.0: -- IRQ 0
>> [    0.458825] pci 0000:00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>> [    0.471790] pci 0000:01:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.471821] pci 0000:01:00.0: -- IRQ 255
>> [    0.476809] pci 0000:01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>
>> < BAR assignments below >
>>
>> [    0.488233] pci 0000:00:00.0: BAR 14: assigned [mem 0xfe200000-0xfe2fffff]
>> [    0.488265] pci 0000:00:00.0: BAR 15: assigned [mem 0x38000000-0x380fffff pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but
>> for ARM we need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.
This is true. But there you could have config space trapped in the "x86
generic way"; please correct me if I'm wrong here.
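What I mean by "re-setup vPCI BARs for hwdom" could look roughly like the
sketch below: once Dom0 has finished assigning resources (or lazily, before
the first hwdom access), refresh the addresses vPCI cached at "add" time
from the physical config space. vpci_sync_bars() is a made-up name and the
function is illustrative only, not part of this series; only struct
vpci_bar, pci_conf_read32() and the register constants are the existing
ones:

/*
 * Illustrative only: re-read the physical memory BARs and update the
 * addresses vPCI cached when init_bars() ran, which on ARM happens
 * before Dom0 has assigned BAR resources.  IO and ROM BARs are left
 * out for brevity.
 */
static void vpci_sync_bars(const struct pci_dev *pdev)
{
    unsigned int i;

    for ( i = 0; i < ARRAY_SIZE(pdev->vpci->header.bars); i++ )
    {
        struct vpci_bar *bar = &pdev->vpci->header.bars[i];
        unsigned int reg = PCI_BASE_ADDRESS_0 + i * 4;

        if ( bar->type != VPCI_BAR_MEM32 && bar->type != VPCI_BAR_MEM64_LO )
            continue;

        /* Low half and, for 64-bit BARs, the high half of the address. */
        bar->addr = pci_conf_read32(pdev->sbdf, reg) &
                    PCI_BASE_ADDRESS_MEM_MASK;
        if ( bar->type == VPCI_BAR_MEM64_LO )
            bar->addr |= (uint64_t)pci_conf_read32(pdev->sbdf, reg + 4) << 32;
    }
}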
>
> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().
This is about PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device:
currently we initialize BARs during PHYSDEVOP_pci_device_add, and if we
do that during XEN_DOMCTL_assign_device instead things seem to change.
>
>> Things are getting even more complicated if the host PCI bridge is not
>> ECAM-like, so you cannot set mmio_handlers and trap hwdom's access to
>> the config space to update BARs etc. This is why I have that ugly hack
>> for rcar_gen3 to read actual BARs for hwdom.
> How do config space accesses work there? At the latest for MSI/MSI-X it'll
> be imperative that Xen be able to intercept config space writes.
>
>>> In order to do passthrough to domUs safely we will have to add more
>>> handlers than what's required for dom0,
>> Can you please tell what you are thinking about? What are the missing
>> handlers?
>>> and having is_hardware_domain sprinkled in all of them is not a
>>> suitable solution.
>> I'll try to replace is_hardware_domain with something like:
>>
>> +/*
>> + * Detect whether physical PCI devices in this segment belong
>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>> + * but in case of ARM this might not be the case: those may also
>> + * live in driver domains or even Xen itself.
>> + */
>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return is_hardware_domain(d);
>> +#elif CONFIG_ARM
>> +    return pci_is_owner_domain(d, seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>> +
>> +/*
>> + * Get the domain which owns this segment: for x86 this is always the
>> + * hardware domain, while for ARM it can be a different domain.
>> + */
>> +struct domain *pci_get_hardware_domain(u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return hardware_domain;
>> +#elif CONFIG_ARM
>> +    return pci_get_owner_domain(seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>>
>> This is what I use to properly detect the domain that really owns the
>> physical host bridge.
> I consider this problematic. We should try to not let Arm's and x86's
> PCI implementations diverge too much, i.e. at least the underlying basic
> model would better be similar. For example, if entire segments can be
> assigned to a driver domain on Arm, why should the same not be possible
> on x86?
Good question; probably in this case x86 == ARM and I can use
pci_is_owner_domain for both architectures instead of using
is_hardware_domain for x86.
>
> Furthermore I'm suspicious about segments being the right granularity
> here. On ia64 multiple host bridges could (and typically would) live
> on segment 0. Iirc I had once seen output from an x86 system which was
> apparently laid out similarly. Therefore, just like for MCFG, I think
> the granularity wants to be bus ranges within a segment.
Can you please suggest something we can use as a hint for such a
detection logic?
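To make sure I read the bus-range suggestion right: something like the
sketch below, i.e. a bus-aware variant of the pci_get_owner_domain() I
mentioned above, keyed by (segment, bus range) similarly to how MCFG
regions are described? All names here are invented for illustration only:

/*
 * Illustrative only: track the owning domain per host bridge, i.e. per
 * (segment, bus range), rather than per whole segment.
 */
struct pci_bridge_owner {
    struct list_head node;
    uint16_t seg;
    uint8_t bus_start, bus_end;   /* inclusive decoded bus range */
    struct domain *owner;
};

static LIST_HEAD(bridge_owners);

static struct domain *pci_get_owner_domain_bus(uint16_t seg, uint8_t bus)
{
    const struct pci_bridge_owner *o;

    list_for_each_entry ( o, &bridge_owners, node )
        if ( o->seg == seg && bus >= o->bus_start && bus <= o->bus_end )
            return o->owner;

    /* No explicit owner registered: fall back to current behaviour. */
    return hardware_domain;
}

The fallback would keep today's x86 behaviour when nothing more specific
has been registered.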
>
> Jan

Thank you,
Oleksandr