
Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal



On 16.11.2021 15:24, Oleksandr Andrushchenko wrote:
> 
> 
> On 16.11.21 16:12, Jan Beulich wrote:
>> On 16.11.2021 14:41, Oleksandr Andrushchenko wrote:
>>>
>>> On 16.11.21 10:23, Oleksandr Andrushchenko wrote:
>>>> On 16.11.21 10:01, Jan Beulich wrote:
>>>>> On 16.11.2021 08:32, Oleksandr Andrushchenko wrote:
>>>>>> On 15.11.21 18:56, Jan Beulich wrote:
>>>>>>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>>>
>>>>>>>> When a vPCI is removed for a PCI device it is possible that we have
>>>>>>>> scheduled a delayed work for map/unmap operations for that device.
>>>>>>>> For example, the following scenario can illustrate the problem:
>>>>>>>>
>>>>>>>> pci_physdev_op
>>>>>>>>        pci_add_device
>>>>>>>>            init_bars -> modify_bars -> defer_map -> 
>>>>>>>> raise_softirq(SCHEDULE_SOFTIRQ)
>>>>>>>>        iommu_add_device <- FAILS
>>>>>>>>        vpci_remove_device -> xfree(pdev->vpci)
>>>>>>>>
>>>>>>>> leave_hypervisor_to_guest
>>>>>>>>        vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci 
>>>>>>>> == NULL
>>>>>>>>
> >>>>>>>> For the hardware domain we continue execution as the worst that
>>>>>>>> could happen is that MMIO mappings are left in place when the
>>>>>>>> device has been deassigned
>>>>>>> Is continuing safe in this case? I.e. isn't there the risk of a NULL
>>>>>>> deref?
>>>>>> I think it is safe to continue
>>>>> And why do you think so? I.e. why is there no race for Dom0 when there
>>>>> is one for DomU?
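
For concreteness, the window described in the commit message is hit in
vpci_process_pending(); a minimal sketch of the guard being discussed,
assuming that cancelling simply means dropping the deferred rangeset (that
cleanup is an assumption, not the eventual patch):

bool vpci_process_pending(struct vcpu *v)
{
    if ( v->vpci.mem )
    {
        /* vpci_remove_device() already freed pdev->vpci: cancel the work. */
        if ( !v->vpci.pdev->vpci )
        {
            rangeset_destroy(v->vpci.mem);
            v->vpci.mem = NULL;
            return false;
        }

        /* ... carry out the deferred {un}map as before ... */
    }

    return false;
}
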
>>>> Well, then we need to use a lock to synchronize the two.
>>>> I guess this needs to be the pcidevs lock, unfortunately.
>>> The parties involved in deferred work and its cancellation:
>>>
>>> MMIO trap -> vpci_write -> vpci_cmd_write -> modify_bars -> defer_map
>>>
>>> Arm: leave_hypervisor_to_guest -> check_for_vcpu_work -> vpci_process_pending
>>>
>>> x86: two call sites -> hvm_do_resume -> vpci_process_pending
>>>
>>> So, both defer_map and vpci_process_pending need to be synchronized with
>>> pcidevs_{lock|unlock}.
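
For illustration, the synchronization proposed above would look roughly as
follows on the exit path; process_deferred_map() is a hypothetical stand-in
for the existing body of vpci_process_pending(), not a real helper:

bool vpci_process_pending(struct vcpu *v)
{
    bool again = false;

    pcidevs_lock();

    /*
     * vpci_remove_device() runs with the pcidevs lock held as well, so
     * pdev->vpci cannot be freed while the deferred work is carried out.
     */
    if ( v->vpci.mem && v->vpci.pdev->vpci )
        again = process_deferred_map(v);

    pcidevs_unlock();

    return again;
}
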
>> If I were an Arm maintainer, I'm afraid I would object to the pcidevs lock
>> getting used in leave_hypervisor_to_guest.
> I do agree this is really not good, but it seems my choices are limited.
> @Stefano, @Julien, do you see any better way of doing that?
> 
> We were thinking about introducing a dedicated lock for vpci [1],
> but finally decided to use pcidevs_lock for now.

Even that locking model might be too heavyweight for this purpose,
unless an r/w lock was intended. The problem would still be that
all guest exits would be serialized within a domain. (That's still
better than serializing all guest exits on the host, of course.)
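
A rough shape of that alternative, for reference: a per-domain r/w lock
(the d->vpci_rwlock field below is an assumption, not existing code),
taken for reading on the exit path so vCPUs of the same domain do not
serialize against each other, and for writing when the vPCI state is
torn down:

    /* Guest exit path, e.g. in vpci_process_pending(): */
    read_lock(&v->domain->vpci_rwlock);
    /* ... pdev->vpci cannot be freed while the deferred work runs ... */
    read_unlock(&v->domain->vpci_rwlock);

    /* Removal path, e.g. in vpci_remove_device(): */
    write_lock(&pdev->domain->vpci_rwlock);
    xfree(pdev->vpci);
    pdev->vpci = NULL;
    write_unlock(&pdev->domain->vpci_rwlock);
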

Jan

> [1] https://lore.kernel.org/all/afe47397-a792-6b0c-0a89-b47c523e50d9@xxxxxxxx/
> 
