Re: [Xen-devel] IOMMU: improve the FLR logic and move it from hypervisor to Control Panel?
Hi,

The term 'Control Panel' is rather unfamiliar to me. Does it mean
qemu-dm for HVM guests?

I think pciback in the dom0 kernel would be the right place to do FLR,
because it is commonly used as the holder of pass-through PCI devices
for both PV and HVM guests. The drawback is that communication between
pciback and the dom0 userspace tools may become complicated. But in
general, it seems good to let the dom0 kernel control PCI devices.

Regards,
-- Yosuke

Cui, Dexuan wrote:
> Hi, Keir and all
> Do you think the improvement to the FLR logic is OK? And moving it to Control
> Panel?
> I'm going to make a patch based on this.
>
> Thanks,
> -- Dexuan
>
>
> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Cui, Dexuan
> Sent: June 19, 2008 13:14
> To: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-devel] IOMMU: improve the FLR logic and move it from hypervisor
> to Control Panel?
>
> Currently, when creating/destroying an hvm guest with assigned devices, we
> perform FLR for the devices in the hypervisor:
> xen/drivers/passthrough/vtd/utils.c: pdev_flr().
> The logic is:
> a) if the device is a PCI-e endpoint and it supports FLR, use that;
> b) for other cases, we use the D3hot/D0 transition for FLR.
>
> There are some issues:
>
> 1) It looks like there are few PCIe devices supporting FLR now. So currently,
> almost all the PCIe devices and all PCI devices use the D3hot/D0 method.
> However, the Dstate transition is actually not guaranteed to properly
> clear the device state;
>
> 2) in case a), the current implementation is actually buggy:
> Transaction_Pending_bit==0 doesn't mean the completion of FLR; it is just
> a way to ensure there is no pending transaction when we're going
> to issue FLR (so we can be sure there is no data corruption).
> And according to the PCIe spec, after issuing FLR, we should wait at least
> 100ms, but "mdelay(100)" is not acceptable in Xen...
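Issue 2 in the quoted message is about ordering: the Transactions
Pending bit is checked before the reset is issued, and the 100ms wait
comes after it. A minimal sketch of that ordering in C follows. It is
not the Xen or pciback code. The cfg_ops callbacks, the fake_dev test
double, the retry count of 4 and the -1 error return are all
hypothetical choices made for illustration. The register offsets and
bit masks are the ones from the PCI Express capability (Device Control
at +0x08 with Initiate FLR in bit 15, Device Status at +0x0A with
Transactions Pending in bit 5).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets relative to the PCI Express capability base. */
#define EXP_DEVCTL        0x08
#define EXP_DEVCTL_FLR    0x8000   /* Initiate Function Level Reset */
#define EXP_DEVSTA        0x0A
#define EXP_DEVSTA_TRPND  0x0020   /* Transactions Pending */
#define FLR_WAIT_MS       100      /* minimum wait after issuing FLR */
#define DRAIN_RETRIES     4        /* hypothetical retry budget */

/* Hypothetical config-space accessors supplied by the caller. */
struct cfg_ops {
    uint16_t (*read16)(void *dev, unsigned int off);
    void (*write16)(void *dev, unsigned int off, uint16_t val);
    void (*sleep_ms)(void *dev, unsigned int ms);
};

/* Returns 0 on success, -1 if pending transactions never drained. */
int pcie_flr(const struct cfg_ops *ops, void *dev, unsigned int cap)
{
    unsigned int tries;
    uint16_t ctl;

    assert(ops != NULL && dev != NULL);

    /* Before FLR: make sure no transaction is still in flight. */
    for (tries = 0; tries < DRAIN_RETRIES; tries++) {
        if (!(ops->read16(dev, cap + EXP_DEVSTA) & EXP_DEVSTA_TRPND))
            break;
        ops->sleep_ms(dev, 100);
    }
    if (tries == DRAIN_RETRIES)
        return -1;

    /* Issue FLR by setting the Initiate FLR bit. */
    ctl = ops->read16(dev, cap + EXP_DEVCTL);
    ops->write16(dev, cap + EXP_DEVCTL, ctl | EXP_DEVCTL_FLR);

    /* After FLR: completion is signalled by time, not by TRPND. */
    ops->sleep_ms(dev, FLR_WAIT_MS);
    return 0;
}

/* In-memory stand-in for a device, only for exercising the sketch. */
struct fake_dev {
    unsigned int cap;            /* capability base offset */
    unsigned int pending_reads;  /* status reads that still report TRPND */
    bool flr_issued;
    unsigned int slept_after_flr_ms;
};

static uint16_t fake_read16(void *p, unsigned int off)
{
    struct fake_dev *d = p;

    if (off == d->cap + EXP_DEVSTA && d->pending_reads > 0) {
        d->pending_reads--;
        return EXP_DEVSTA_TRPND;
    }
    return 0;
}

static void fake_write16(void *p, unsigned int off, uint16_t val)
{
    struct fake_dev *d = p;

    if (off == d->cap + EXP_DEVCTL && (val & EXP_DEVCTL_FLR))
        d->flr_issued = true;
}

static void fake_sleep_ms(void *p, unsigned int ms)
{
    struct fake_dev *d = p;

    /* Record the time instead of really sleeping. */
    if (d->flr_issued)
        d->slept_after_flr_ms += ms;
}

static const struct cfg_ops fake_ops = {
    fake_read16, fake_write16, fake_sleep_ms
};
```

In dom0 (kernel or userspace) the sleep_ms callback can simply block,
which is the point of moving the logic out of the hypervisor.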
>
> To resolve the issues, I propose to change the FLR logic to:
>
> 1) If the device is PCIe endpoint and supports PCIe FLR, use that;
> 2) Else, if the device is PCIe endpoint, and all functions on the device
> are assigned to the same guest, we use the immediate parent bus's
> "Secondary Bus Reset" to reset all functions of the device (here,
> actually we require all the functions of the device be assigned to the
> same guest);
> 3) Else, if the device is PCI endpoint and is on a host bus (e.g.
> integrated devices), and if the device supports PCI "Advanced
> Capabilities", we use that for FLR;
> 4) Else, if the device is a vendor integrated PCI device with "known"
> set of vendor/device id, we use the vendor-defined method of issuing
> FLR. For instance, for the VendorID=0x8086, we can use the method
> defined in Intel ICH9 Datasheet to perform FLR;
> 5) Else, we use the "Secondary Bus Reset" (we ensure all the PCI devices
> behind a bridge must be assigned to the same guest).
>
> And I propose to move the FLR logic to Control Panel.
> The benefits are:
> 1) It's natural, and makes the hypervisor thin;
> 2) The 100ms-delay can be implemented easily in Control Panel, but not
> easily in hypervisor;
> 3) Some logic, like the lookup of a device's BDF to its parent's BDF can
> be done more easily in Control Panel.
>
> Comments are appreciated.
>
> Thanks,
> -- Dexuan
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
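
The five-step fallback order in the quoted proposal can be read as a
single decision function. The following C sketch only illustrates that
ordering; it is not taken from Xen, pciback or the tools. The dev_info
fields and the reset_method names are hypothetical, and returning
RESET_NONE when the step 5 precondition does not hold is an assumption
(the proposal only says the precondition must be ensured).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reset methods, listed in the priority order of the proposal. */
enum reset_method {
    RESET_PCIE_FLR,        /* 1) PCIe Function Level Reset */
    RESET_PARENT_SBR,      /* 2) Secondary Bus Reset on the parent bus */
    RESET_PCI_AF_FLR,      /* 3) PCI Advanced Capabilities FLR */
    RESET_VENDOR_SPECIFIC, /* 4) vendor-defined method for known IDs */
    RESET_BRIDGE_SBR,      /* 5) Secondary Bus Reset as the last resort */
    RESET_NONE             /* assumption: no safe method available */
};

/* Hypothetical summary of what the caller has learned about a device. */
struct dev_info {
    bool is_pcie_endpoint;
    bool has_pcie_flr;
    bool all_functions_same_guest;
    bool on_host_bus;
    bool has_pci_af_flr;
    bool has_known_vendor_method;
    bool all_devices_behind_bridge_same_guest;
};

enum reset_method choose_reset_method(const struct dev_info *dev)
{
    assert(dev != NULL);

    /* 1) PCIe endpoint that supports PCIe FLR. */
    if (dev->is_pcie_endpoint && dev->has_pcie_flr)
        return RESET_PCIE_FLR;

    /* 2) PCIe endpoint whose functions all belong to one guest. */
    if (dev->is_pcie_endpoint && dev->all_functions_same_guest)
        return RESET_PARENT_SBR;

    /* 3) Conventional PCI endpoint on a host bus with Advanced
     *    Capabilities FLR. */
    if (!dev->is_pcie_endpoint && dev->on_host_bus && dev->has_pci_af_flr)
        return RESET_PCI_AF_FLR;

    /* 4) Integrated device with a known vendor/device id. */
    if (dev->has_known_vendor_method)
        return RESET_VENDOR_SPECIFIC;

    /* 5) Secondary Bus Reset, only if every device behind the bridge
     *    is assigned to the same guest. */
    if (dev->all_devices_behind_bridge_same_guest)
        return RESET_BRIDGE_SBR;

    return RESET_NONE;
}
```

Keeping the choice separate from the reset itself means the same
function could be called from pciback or from the userspace tools,
whichever ends up owning the logic.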