[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Fri, 13 Nov 2020 11:02:21 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=WEL+sGuSsto8HF1T0TQocDqYMB5lhW+ndKvHHZ1o/Sw=; b=V1JMXNmoc4uy2U0Kci3p1DRvp/OLb8e63FDN2thVYVkfMjaiZGk2xUR7G6tCXlSHeaIpR/MNPgvMWboD9zElWMx2eHV/LGbFJa28yjb/Mn4h9pt3VfuJZ9uucGI4CeIImlbcjHEKiqfqp0KDfegwXwBH3IYQ2wurMQSPfM/+SuS7i2b4trVN/hqE0xVZduVyoacHqBoKG2RNWrTRvEsDo7Vl7uERmr1yc1TFAglQ8LmcZQrDoL/7ff5lgtiRnkS50BY78fF9xbP6K2hntXphA6/3qAlDWgnXF6teg98ZdHXWigTzRXOoEN/gosvOIZv4BrNRHYCHiCR0EUVG5xHceg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dQhEXpS/qk9SnX5ac0QHBS2zJbAZkBO/b3eNpLXKyI3BNNoDSt+yvJ5HhQ/IEIxqPFOul4o2jymu+je1rYohdn3Lkz8XgiQDeakhLOa7GV12dgSieVZO6htJ0OJop079oy37zXXGJbZ/0mWF4rBwJHj8zDkyz3E3WW8eEmG+ORNWQesrlQwrYy736NElCUswjNFpdd3Ssrsdks3t4WX5LONPHwXoHFiKlsJT2bOtiV3+Oz5p3u2gx4B8tQKzaabijlrM6/9/Us0K8mXITej1KZg5sjiUoezYTY2eFtz6plb4eoWa0uB7miCqLHnFpR06VoEYIENMmAeqBrCPCVfBxQ==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=epam.com;
  • Cc: Oleksandr Andrushchenko <andr2000@xxxxxxxxx>, "Rahul.Singh@xxxxxxx" <Rahul.Singh@xxxxxxx>, "Bertrand.Marquis@xxxxxxx" <Bertrand.Marquis@xxxxxxx>, "julien.grall@xxxxxxx" <julien.grall@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "iwj@xxxxxxxxxxxxxx" <iwj@xxxxxxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 13 Nov 2020 11:02:44 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWtpbzyJg77SbpvkSRWLKjWy/3PanEQmgAgAA8YoCAABlOgIABCD8AgABBOQCAAAXIAIAAAS0AgAADNgA=
  • Thread-topic: [PATCH 06/10] vpci: Make every domain handle its own BARs

On 11/13/20 12:50 PM, Jan Beulich wrote:
> On 13.11.2020 11:46, Oleksandr Andrushchenko wrote:
>> On 11/13/20 12:25 PM, Jan Beulich wrote:
>>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>>>> On Thu, Nov 12, 2020 at 01:16:10PM +0000, Oleksandr Andrushchenko wrote:
>>>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
>>>>>>>> unsigned int reg,
>>>>>>>> +                                  void *data)
>>>>>>>> +{
>>>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>>>> +
>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>>>> +
>>>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>>>> +    if ( !vbar )
>>>>>>>> +        return ~0;
>>>>>>>> +
>>>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
>>>>>>>> int reg,
>>>>>>>> +                               uint32_t val, void *data)
>>>>>>>> +{
>>>>>>>> +    struct vpci_bar *bar = data;
>>>>>>>> +
>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>>>> +    else
>>>>>>>> +    {
>>>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>>>>>>>> bar->index);
>>>>>>>> +
>>>>>>>> +        if ( !vbar )
>>>>>>>> +            return;
>>>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>>>> +    }
>>>>>>>> +}
>>>>>>> You should assign different handlers based on whether the domain that
>>>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>>>> doing the selection here.
>>>>>> Hm, handlers are assigned once in init_bars and this function is only 
>>>>>> called
>>>>>>
>>>>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>>>>> dispatcher
>>>>> I think we might want to reset the vPCI handlers when a devices gets
>>>>> assigned and deassigned.
>>>> In ARM case init_bars is called too early: PCI device assignment is 
>>>> currently
>>>>
>>>> initiated by Domain-0' kernel and is done *before* PCI devices are given 
>>>> memory
>>>>
>>>> ranges and BARs assigned:
>>>>
>>>> [    0.429514] pci_bus 0000:00: root bus resource [bus 00-ff]
>>>> [    0.429532] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
>>>> [    0.429555] pci_bus 0000:00: root bus resource [mem 
>>>> 0xfe200000-0xfe3fffff]
>>>> [    0.429575] pci_bus 0000:00: root bus resource [mem 
>>>> 0x30000000-0x37ffffff]
>>>> [    0.429604] pci_bus 0000:00: root bus resource [mem 
>>>> 0x38000000-0x3fffffff pref]
>>>> [    0.429670] pci 0000:00:00.0: enabling Extended Tags
>>>> [    0.453764] pci 0000:00:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>>>
>>>> < init_bars >
>>>>
>>>> [    0.453793] pci 0000:00:00.0: -- IRQ 0
>>>> [    0.458825] pci 0000:00:00.0: Failed to add - passthrough or MSI/MSI-X 
>>>> might fail!
>>>> [    0.471790] pci 0000:01:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>>>
>>>> < init_bars >
>>>>
>>>> [    0.471821] pci 0000:01:00.0: -- IRQ 255
>>>> [    0.476809] pci 0000:01:00.0: Failed to add - passthrough or MSI/MSI-X 
>>>> might fail!
>>>>
>>>> < BAR assignments below >
>>>>
>>>> [    0.488233] pci 0000:00:00.0: BAR 14: assigned [mem 
>>>> 0xfe200000-0xfe2fffff]
>>>> [    0.488265] pci 0000:00:00.0: BAR 15: assigned [mem 
>>>> 0x38000000-0x380fffff pref]
>>>>
>>>> In case of x86 this is pretty much ok as BARs are already in place, but 
>>>> for ARM we
>>>>
>>>> need to take care and re-setup vPCI BARs for hwdom.
>>> Even on x86 there's no guarantee that all devices have their BARs set
>>> up by firmware.
>> This is true. But there you could have config space trapped in "x86 generic 
>> way",
>>
>> please correct me if I'm wrong here
>>
>>> In a subsequent reply you've suggested to move init_bars from "add" to
>>> "assign", but I'm having trouble seeing what this would change: It's
>>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>>> towards the end of pci_add_device().
>> PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device
>>
>> Currently we initialize BARs during PHYSDEVOP_pci_device_add and
>>
>> if we do that during XEN_DOMCTL_assign_device things seem to change
> But there can't possibly be any XEN_DOMCTL_assign_device involved in
> booting of Dom0.

Indeed. So, do you have an idea when we should call init_bars suitable

for both ARM and x86?

Another question is: what happens bad if x86 and ARM won't call init_bars

until the moment we really assign a PCI device to the first guest?

>
>>>>>     In order to do passthrough to domUs safely
>>>>> we will have to add more handlers than what's required for dom0,
>>>> Can you please tell what are thinking about? What are the missing handlers?
>>>>>     and
>>>>> having is_hardware_domain sprinkled in all of them is not a suitable
>>>>> solution.
>>>> I'll try to replace is_hardware_domain with something like:
>>>>
>>>> +/*
>>>> + * Detect whether physical PCI devices in this segment belong
>>>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>>>> + * but in case of ARM this might not be the case: those may also
>>>> + * live in driver domains or even Xen itself.
>>>> + */
>>>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>>>> +{
>>>> +#ifdef CONFIG_X86
>>>> +    return is_hardware_domain(d);
>>>> +#elif CONFIG_ARM
>>>> +    return pci_is_owner_domain(d, seg);
>>>> +#else
>>>> +#error "Unsupported architecture"
>>>> +#endif
>>>> +}
>>>> +
>>>> +/*
>>>> + * Get domain which owns this segment: for x86 this is always hardware
>>>> + * domain and for ARM this can be different.
>>>> + */
>>>> +struct domain *pci_get_hardware_domain(u16 seg)
>>>> +{
>>>> +#ifdef CONFIG_X86
>>>> +    return hardware_domain;
>>>> +#elif CONFIG_ARM
>>>> +    return pci_get_owner_domain(seg);
>>>> +#else
>>>> +#error "Unsupported architecture"
>>>> +#endif
>>>> +}
>>>>
>>>> This is what I use to properly detect the domain that really owns physical 
>>>> host bridge
>>> I consider this problematic. We should try to not let Arm's and x86'es
>>> PCI implementations diverge too much, i.e. at least the underlying basic
>>> model would better be similar. For example, if entire segments can be
>>> assigned to a driver domain on Arm, why should the same not be possible
>>> on x86?
>> Good question, probably in this case x86 == ARM and I can use
>>
>> pci_is_owner_domain for both architectures instead of using 
>> is_hardware_domain for x86
>>
>>> Furthermore I'm suspicious about segments being the right granularity
>>> here. On ia64 multiple host bridges could (and typically would) live
>>> on segment 0. Iirc I had once seen output from an x86 system which was
>>> apparently laid out similarly. Therefore, just like for MCFG, I think
>>> the granularity wants to be bus ranges within a segment.
>> Can you please suggest something we can use as a hint for such a detection 
>> logic?
> The underlying information comes from ACPI tables, iirc. I don't
> recall the details, though - sorry.

Ok, so seg + bus should be enough for both ARM and Xen then, right?

pci_get_hardware_domain(u16 seg, u8 bus)

>
> Jan

Thank you,

Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.