Re: [PATCH 06/10] vpci: Make every domain handle its own BARs
On 11/13/20 1:35 PM, Jan Beulich wrote:
> On 13.11.2020 12:02, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:50 PM, Jan Beulich wrote:
>>> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>>>> On 11/13/20 12:25 PM, Jan Beulich wrote:
>>>>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>>>>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>>>>>> On Thu, Nov 12, 2020 at 01:16:10PM +0000, Oleksandr Andrushchenko wrote:
>>>>>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>>>>>
>>>>>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned int reg,
>>>>>>>>>> +                                  void *data)
>>>>>>>>>> +{
>>>>>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>>>>>> +
>>>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>>>>>> +
>>>>>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>>>>>> +    if ( !vbar )
>>>>>>>>>> +        return ~0;
>>>>>>>>>> +
>>>>>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int reg,
>>>>>>>>>> +                               uint32_t val, void *data)
>>>>>>>>>> +{
>>>>>>>>>> +    struct vpci_bar *bar = data;
>>>>>>>>>> +
>>>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>>>>>> +    else
>>>>>>>>>> +    {
>>>>>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>>>>>> +
>>>>>>>>>> +        if ( !vbar )
>>>>>>>>>> +            return;
>>>>>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>>>>>> +    }
>>>>>>>>>> +}
>>>>>>>>> You should assign different handlers based on whether the domain that
>>>>>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>>>>>> doing the selection here.
>>>>>>>> Hm, handlers are assigned once in init_bars and this function is
>>>>>>>> only called for hwdom, so there is no way I can do that for the
>>>>>>>> guests. Hence, the dispatcher.
>>>>>>> I think we might want to reset the vPCI handlers when a device gets
>>>>>>> assigned and deassigned.
>>>>>> In the ARM case init_bars is called too early: PCI device assignment
>>>>>> is currently initiated by Domain-0's kernel and is done *before* PCI
>>>>>> devices are given memory ranges and BARs assigned:
>>>>>>
>>>>>> [ 0.429514] pci_bus 0000:00: root bus resource [bus 00-ff]
>>>>>> [ 0.429532] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
>>>>>> [ 0.429555] pci_bus 0000:00: root bus resource [mem 0xfe200000-0xfe3fffff]
>>>>>> [ 0.429575] pci_bus 0000:00: root bus resource [mem 0x30000000-0x37ffffff]
>>>>>> [ 0.429604] pci_bus 0000:00: root bus resource [mem 0x38000000-0x3fffffff pref]
>>>>>> [ 0.429670] pci 0000:00:00.0: enabling Extended Tags
>>>>>> [ 0.453764] pci 0000:00:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>>>>>
>>>>>> < init_bars >
>>>>>>
>>>>>> [ 0.453793] pci 0000:00:00.0: -- IRQ 0
>>>>>> [ 0.458825] pci 0000:00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>>>>> [ 0.471790] pci 0000:01:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>>>>>
>>>>>> < init_bars >
>>>>>>
>>>>>> [ 0.471821] pci 0000:01:00.0: -- IRQ 255
>>>>>> [ 0.476809] pci 0000:01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>>>>>
>>>>>> < BAR assignments below >
>>>>>>
>>>>>> [ 0.488233] pci 0000:00:00.0: BAR 14: assigned [mem 0xfe200000-0xfe2fffff]
>>>>>> [ 0.488265] pci 0000:00:00.0: BAR 15: assigned [mem 0x38000000-0x380fffff pref]
>>>>>>
>>>>>> In the case of x86 this is pretty much OK as BARs are already in place,
>>>>>> but for ARM we need to take care and re-set up vPCI BARs for hwdom.
>>>>> Even on x86 there's no guarantee that all devices have their BARs set
>>>>> up by firmware.
>>>> This is true. But there you could have the config space trapped in the
>>>> "x86 generic way", please correct me if I'm wrong here.
>>>>> In a subsequent reply you've suggested to move init_bars from "add" to
>>>>> "assign", but I'm having trouble seeing what this would change: It's
>>>>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>>>>> towards the end of pci_add_device().
>>>> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device
>>>>
>>>> Currently we initialize BARs during PHYSDEVOP_pci_device_add and
>>>> if we do that during XEN_DOMCTL_assign_device things seem to change.
>>> But there can't possibly be any XEN_DOMCTL_assign_device involved in
>>> booting of Dom0.
>> Indeed. So, do you have an idea when we should call init_bars suitably
>> for both ARM and x86?
>>
>> Another question is: what goes wrong if x86 and ARM don't call init_bars
>> until the moment we really assign a PCI device to the first guest?
> I'd like to answer the latter question first: How would Dom0 use
> the device prior to such an assignment? As an implication of the
> presumed answer here, I guess init_bars could be deferred up until
> the first time Dom0 (or more generally the possessing domain)
> accesses any of them. Similarly, devices used by Xen itself could
> have this done immediately before first use. This may require
> tracking on a per-device basis whether initialization was done.

Ok, I'll try to look into it.

>>>>>>> In order to do passthrough to domUs safely
>>>>>>> we will have to add more handlers than what's required for dom0,
>>>>>> Can you please tell what you are thinking about? What are the missing
>>>>>> handlers?
>>>>>>> and
>>>>>>> having is_hardware_domain sprinkled in all of them is not a suitable
>>>>>>> solution.
>>>>>> I'll try to replace is_hardware_domain with something like:
>>>>>>
>>>>>> +/*
>>>>>> + * Detect whether physical PCI devices in this segment belong
>>>>>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>>>>>> + * but in case of ARM this might not be the case: those may also
>>>>>> + * live in driver domains or even Xen itself.
>>>>>> + */
>>>>>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>>>>>> +{
>>>>>> +#ifdef CONFIG_X86
>>>>>> +    return is_hardware_domain(d);
>>>>>> +#elif CONFIG_ARM
>>>>>> +    return pci_is_owner_domain(d, seg);
>>>>>> +#else
>>>>>> +#error "Unsupported architecture"
>>>>>> +#endif
>>>>>> +}
>>>>>> +
>>>>>> +/*
>>>>>> + * Get domain which owns this segment: for x86 this is always hardware
>>>>>> + * domain and for ARM this can be different.
>>>>>> + */
>>>>>> +struct domain *pci_get_hardware_domain(u16 seg)
>>>>>> +{
>>>>>> +#ifdef CONFIG_X86
>>>>>> +    return hardware_domain;
>>>>>> +#elif CONFIG_ARM
>>>>>> +    return pci_get_owner_domain(seg);
>>>>>> +#else
>>>>>> +#error "Unsupported architecture"
>>>>>> +#endif
>>>>>> +}
>>>>>>
>>>>>> This is what I use to properly detect the domain that really owns the
>>>>>> physical host bridge.
>>>>> I consider this problematic. We should try to not let Arm's and x86's
>>>>> PCI implementations diverge too much, i.e. at least the underlying basic
>>>>> model would better be similar. For example, if entire segments can be
>>>>> assigned to a driver domain on Arm, why should the same not be possible
>>>>> on x86?
>>>> Good question, probably in this case x86 == ARM and I can use
>>>> pci_is_owner_domain for both architectures instead of using
>>>> is_hardware_domain for x86.
>>>>> Furthermore I'm suspicious about segments being the right granularity
>>>>> here. On ia64 multiple host bridges could (and typically would) live
>>>>> on segment 0. Iirc I had once seen output from an x86 system which was
>>>>> apparently laid out similarly.
>>>>> Therefore, just like for MCFG, I think
>>>>> the granularity wants to be bus ranges within a segment.
>>>> Can you please suggest something we can use as a hint for such a
>>>> detection logic?
>>> The underlying information comes from ACPI tables, iirc. I don't
>>> recall the details, though - sorry.
>> Ok, so seg + bus should be enough for both ARM and x86 then, right?
>>
>> pci_get_hardware_domain(u16 seg, u8 bus)
> Whether an individual bus number can suitably express things I can't
> tell; I did say bus range, but of course if you care about just
> individual devices, then a single bus number will of course do.

I can implement the lookup of whether a PCI host bridge is owned by a
particular domain with something like:

struct pci_host_bridge *bridge = pci_find_host_bridge(seg, bus);

return bridge->dt_node->used_by == d->domain_id;

Could you please give me a hint how this can be done on x86?

> Jan

Thank you,
Oleksandr