Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal
Hi, Julien!

On 16.11.21 20:02, Julien Grall wrote:
> Hi Oleksandr,
>
> On 16/11/2021 14:24, Oleksandr Andrushchenko wrote:
>>
>> On 16.11.21 16:12, Jan Beulich wrote:
>>> On 16.11.2021 14:41, Oleksandr Andrushchenko wrote:
>>>>
>>>> On 16.11.21 10:23, Oleksandr Andrushchenko wrote:
>>>>> On 16.11.21 10:01, Jan Beulich wrote:
>>>>>> On 16.11.2021 08:32, Oleksandr Andrushchenko wrote:
>>>>>>> On 15.11.21 18:56, Jan Beulich wrote:
>>>>>>>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>>>>
>>>>>>>>> When vPCI is removed for a PCI device it is possible that we have
>>>>>>>>> scheduled delayed work for map/unmap operations for that device.
>>>>>>>>> For example, the following scenario illustrates the problem:
>>>>>>>>>
>>>>>>>>> pci_physdev_op
>>>>>>>>>     pci_add_device
>>>>>>>>>         init_bars -> modify_bars -> defer_map ->
>>>>>>>>>             raise_softirq(SCHEDULE_SOFTIRQ)
>>>>>>>>>     iommu_add_device <- FAILS
>>>>>>>>>     vpci_remove_device -> xfree(pdev->vpci)
>>>>>>>>>
>>>>>>>>> leave_hypervisor_to_guest
>>>>>>>>>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>>>>>>>>>
>>>>>>>>> For the hardware domain we continue execution as the worst that
>>>>>>>>> could happen is that MMIO mappings are left in place when the
>>>>>>>>> device has been deassigned
>>>>>>>> Is continuing safe in this case? I.e. isn't there the risk of a NULL
>>>>>>>> deref?
>>>>>>> I think it is safe to continue
>>>>>> And why do you think so? I.e. why is there no race for Dom0 when there
>>>>>> is one for DomU?
>>>>> Well, then we need to use a lock to synchronize the two.
>>>>> I guess this needs to be the pcidevs lock, unfortunately
>>>> The parties involved in the deferred work and its cancellation:
>>>>
>>>> MMIO trap -> vpci_write -> vpci_cmd_write -> modify_bars -> defer_map
>>>>
>>>> Arm: leave_hypervisor_to_guest -> check_for_vcpu_work -> vpci_process_pending
>>>>
>>>> x86: two places -> hvm_do_resume -> vpci_process_pending
>>>>
>>>> So, both defer_map and vpci_process_pending need to be synchronized with
>>>> pcidevs_{lock|unlock}.
>>> If I was an Arm maintainer, I'm afraid I would object to the pcidevs lock
>>> getting used in leave_hypervisor_to_guest.
>> I do agree this is really not good, but it seems I am limited in choices.
>> @Stefano, @Julien, do you see any better way of doing that?
>
> I agree with Jan about using the pcidevs_{lock|unlock}. The lock is not
> fine-grained enough to be called in vpci_process_pending().
>
> I haven't yet looked at the rest of the series to be able to suggest the
> exact lock. But we at least want a per-domain spinlock.
>
>>
>> We were thinking about introducing a dedicated lock for vpci [1],
>> but finally decided to use pcidevs_lock for now
>
> Skimming through the thread, you decided to use pcidevs_lock because it was
> simpler and sufficient for the use case discussed back then. Now we have a
> use case where it would be a problem to use pcidevs_lock, so I think the
> extra complexity is justified.

I would like to understand what this lock should be, so I can implement it
properly. We have the following options as I see them:

1. pcidevs_{lock|unlock} - considered too heavy, per host
2. pdev->vpci->lock - better, but still heavy, per PCI device
3. We may convert pdev->vpci->lock into an r/w lock
4. We may introduce a specific lock (a rough sketch of a per-domain lock is at
   the end of this mail)

To better understand the scope of the lock:

1. MMIO trap handlers (vpci_{read|write}) - already protected with pdev->vpci->lock
2. vpci_process_pending (SOFTIRQ context)
3. Hypercalls which call pci_{add|remove|assign|deassign}_device
4. @Roger, did I miss something?

And I feel this needs a dedicated patch: I am not sure it is a good idea to
fold the locking change into this patch, where it seems not so relevant.

> Cheers,
>
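To make the Dom0 vs DomU discussion easier to follow, here is a minimal sketch
of the kind of guard the quoted scenario implies in vpci_process_pending().
The v->vpci.mem and v->vpci.pdev fields are taken from the scenario above; the
body is illustrative only and not the actual patch:

    /* Sketch only: drop the deferred work if vPCI state is already gone. */
    bool vpci_process_pending(struct vcpu *v)
    {
        /* Nothing was deferred by defer_map(). */
        if ( !v->vpci.mem )
            return false;

        /*
         * vpci_remove_device() may have freed pdev->vpci after defer_map()
         * ran but before this softirq-driven work got a chance to execute.
         */
        if ( !v->vpci.pdev->vpci )
        {
            /* Drop the deferred work instead of touching freed state. */
            rangeset_destroy(v->vpci.mem);
            v->vpci.mem = NULL;
            return false;
        }

        /* ... perform the deferred BAR map/unmap here ... */
        return true;
    }

The check alone is of course racy without a lock, which is exactly the point
being discussed above.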
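And to make option 4 (or the per-domain spinlock Julien suggests) concrete,
something along these lines is what I would picture. All names here, in
particular d->vpci_lock, are made up for illustration and do not exist in the
tree:

    /*
     * Hypothetical per-domain lock serialising the deferred work against
     * device removal.
     */
    static void defer_map(struct domain *d, struct pci_dev *pdev)
    {
        spin_lock(&d->vpci_lock);
        /* Queue the map/unmap work on current->vpci and raise the softirq. */
        spin_unlock(&d->vpci_lock);
    }

    bool vpci_process_pending(struct vcpu *v)
    {
        struct domain *d = v->domain;
        bool more = false;

        spin_lock(&d->vpci_lock);
        /*
         * Safe to look at v->vpci.pdev->vpci here: vpci_remove_device()
         * takes the same lock before freeing pdev->vpci.
         */
        if ( v->vpci.mem && v->vpci.pdev->vpci )
            more = true; /* ... do the deferred map/unmap ... */
        spin_unlock(&d->vpci_lock);

        return more;
    }

    void vpci_remove_device(struct pci_dev *pdev)
    {
        spin_lock(&pdev->domain->vpci_lock);
        /* Cancel any deferred work referencing pdev, then free its vPCI. */
        xfree(pdev->vpci);
        pdev->vpci = NULL;
        spin_unlock(&pdev->domain->vpci_lock);
    }

The idea being that everything touching the deferred state takes a per-domain
lock rather than a global one, so taking it in leave_hypervisor_to_guest would
not serialise all domains the way pcidevs_lock would.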