Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal
On 18.11.2021 08:49, Oleksandr Andrushchenko wrote:
>
> On 17.11.21 10:28, Jan Beulich wrote:
>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>
>>> When vPCI is removed for a PCI device it is possible that we have
>>> scheduled delayed work for map/unmap operations for that device.
>>> For example, the following scenario illustrates the problem:
>>>
>>> pci_physdev_op
>>>     pci_add_device
>>>         init_bars -> modify_bars -> defer_map ->
>>>             raise_softirq(SCHEDULE_SOFTIRQ)
>>>     iommu_add_device <- FAILS
>>>     vpci_remove_device -> xfree(pdev->vpci)
>>>
>>> leave_hypervisor_to_guest
>>>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>>>
>>> For the hardware domain we continue execution, as the worst that
>>> could happen is that MMIO mappings are left in place when the
>>> device has been deassigned.
>>>
>>> For unprivileged domains that get a failure in the middle of a vPCI
>>> {un}map operation we need to destroy them, as we don't know in which
>>> state the p2m is. This can only happen in vpci_process_pending for
>>> DomUs, as they won't be allowed to call pci_add_device.
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>
>> Thinking about it some more, I'm not convinced any of this is really
>> needed in the presented form.
>
> The intention of this patch was to handle error conditions which are
> abnormal, e.g. when iommu_add_device fails and we are in the middle
> of initialization. So I am trying to cancel all pending work which
> might already be there, and not to crash.

Only Dom0 may be able to prematurely access the device during "add". Yet
unlike for DomU-s we generally expect Dom0 to be well-behaved. Hence I'm
not sure I see the need for dealing with these.

>> Removal of a vPCI device is the analogue of hot-unplug on baremetal.
>> That's not a "behind the backs of everything" operation. Instead the
>> host admin has to prepare the device for removal, which will result
>> in it being quiescent (which in particular means no BAR adjustments
>> anymore). The act of removing the device from the system has as its
>> virtual counterpart "xl pci-detach". I think it ought to be in this
>> context that pending requests get drained, and an indicator be set
>> that no further changes to that device are permitted. This would mean
>> invoking from vpci_deassign_device() as added by patch 4, not from
>> vpci_remove_device(). This would make removal of a device from the
>> host independent of removal of a device from a guest.
>>
>> The need for vpci_remove_device() seems questionable in the first
>> place: Even for hot-unplug on the host it may be better to require a
>> pci-detach from (PVH) Dom0 before the actual device removal. This
>> would involve an adjustment to the de-assignment logic for the case
>> of no quarantining: We'd need to make sure explicit de-assignment
>> from Dom0 actually removes the device from there; right now
>> de-assignment assumes "from DomU" and "to Dom0 or DomIO" (depending
>> on quarantining mode).

As to this, I meanwhile think that add/remove can very well have Dom0
related vPCI init/teardown. But for DomU all of that should happen
during assign/de-assign. A device still assigned to a DomU simply
should never be subject to physical hot-unplug in the first place.

Jan