Re: [XEN PATCH] tools/libs/light/libxl_pci.c: explicitly grant access to Intel IGD opregion
On 3/30/22 1:15 PM, Anthony PERARD wrote:
> Hi Chuck,
>
> On Sun, Mar 13, 2022 at 11:41:37PM -0400, Chuck Zmudzinski wrote:
>> When gfx_passthru is enabled for the Intel IGD, hvmloader maps the IGD
>> opregion to the guest but libxl does not grant the guest permission to
>
> I'm not reading the same thing when looking at the code in hvmloader. It
> seems that hvmloader allocates some memory for the IGD opregion rather
> than mapping it.
>
> tools/firmware/hvmloader/pci.c:184
>     if ( vendor_id == 0x8086 )
>     {
>         igd_opregion_pgbase = mem_hole_alloc(IGD_OPREGION_PAGES);
>         /*
>          * Write the OpRegion offset to give the opregion
>          * address to the device model. The device model will trap
>          * and map the OpRegion at the given address.
>          */
>         pci_writel(vga_devfn, PCI_INTEL_OPREGION,
>                    igd_opregion_pgbase << PAGE_SHIFT);
>     }
>
> I think this write would go through QEMU, does it do something with it?

AFAICT, QEMU does do something with it: in upstream QEMU, hw/xen/xen_pt_config_init.c and hw/xen/xen_pt_graphics.c implement functions to read and write the OpRegion address, which means QEMU reads and writes the mapped OpRegion address that is passed to it from hvmloader. It is a 32-bit address stored in the config attribute in sysfs for the Intel IGD on Linux guests. I have examined the config attribute of the Intel IGD in both the control domain (dom0) and the Linux guest, and in both cases the address of the IGD OpRegion is consistent with custom logs I have examined from xen/common/domctl.c, which show the same machine and guest addresses for the OpRegion that the config attribute has for the machine and the guest. (I am kind of replying to this question at the end of the mail.)

> Is this code in hvmloader actually run in your case?

I admit I have not verified with certainty that QEMU and the guest are getting the OpRegion address from hvmloader.
I will do a test to verify it, such as removing the code from hvmloader that writes the address in the pci config attribute and seeing if that prevents the guest from finding the address where the OpRegion is mapped in the guest. That would prove that the hvmloader code here is run in my case.

>> Currently, because of another bug in Qemu upstream, this crash can only
>> be reproduced using the traditional Qemu device model, and of
>
> qemu-traditional is a bit old... What is the bug in QEMU? Do you have a
> link to a patch/mail?

Not yet. I am working on it now. The bug is related to the passthrough of the IRQ to the guest. So far I have compared the logs in the Linux guest using upstream QEMU with QEMU traditional, and I have found that with upstream QEMU, the Linux kernel in the guest reports that it cannot obtain IRQ 24:

Mar 29 18:31:39 debian kernel: xen: --> pirq=16 -> irq=24 (gsi=24)
Mar 29 18:31:39 debian kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
...
Mar 29 18:31:39 debian kernel: xen:events: Failed to obtain physical IRQ 24

When using the traditional device model, this failure is not reported. The failure of the system to handle the IRQ prevents the guest from booting normally with upstream QEMU. Comparing upstream QEMU code with QEMU traditional code, I notice that when traditional QEMU sets up the pci-isa bridge at slot 1f, it configures an IRQ, but upstream QEMU does not, so I suspect that is where the problem is, but I have not found the fix yet. It is well known that the pci-isa bridge at slot 1f needs to be specially configured for the Intel IGD to function properly.
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 4bbbfe9f16..a4fc473de9 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -2531,6 +2572,37 @@ int libxl__grant_vga_iomem_permission(libxl__gc *gc, const uint32_t domid,
>>                domid, vga_iomem_start, (vga_iomem_start + 0x20 - 1));
>>          return ret;
>>      }
>> +
>> +    /* If this is an Intel IGD, allow access to the IGD opregion */
>> +    if (!libxl__is_igd_vga_passthru(gc, d_config)) return 0;
>> +
>> +    uint32_t igd_opregion = sysfs_dev_get_igd_opregion(gc, pci);
>> +    uint32_t error = 0xffffffff;
>> +    if (igd_opregion == error) break;
>> +
>> +    vga_iomem_start = ( (uint64_t) igd_opregion ) >> XC_PAGE_SHIFT;
>> +    ret = xc_domain_iomem_permission(CTX->xch, stubdom_domid,
>> +                                     vga_iomem_start,
>> +                                     IGD_OPREGION_PAGES, 1);
>> +    if (ret < 0) {
>> +        LOGED(ERROR, domid,
>> +              "failed to give stubdom%d access to iomem range "
>> +              "%"PRIx64"-%"PRIx64" for IGD passthru",
>> +              stubdom_domid, vga_iomem_start, (vga_iomem_start +
>> +                                               IGD_OPREGION_PAGES - 1));
>> +        return ret;
>> +    }
>> +    ret = xc_domain_iomem_permission(CTX->xch, domid,
>> +                                     vga_iomem_start,
>> +                                     IGD_OPREGION_PAGES, 1);
>
> Here, you seem to add permission to an address that is read from the pci
> config space of the device, but as I've pointed out above, hvmloader
> seems to overwrite this address.

I do not think hvmloader overwrites this address. I think of it as hvmloader passing the mapped address to the device model and guest, but as I said above, I will do more tests to verify with certainty what hvmloader is actually doing.

> Does this call to xc_domain_iomem_permission() actually do anything
> useful?

I am certain this call to xc_domain_iomem_permission() is necessary with the QEMU traditional device model to obtain permission for the guest (domid) to access the OpRegion. Without it, the bug is not fixed and I get the crash in the i915 kernel module in the Linux guest and a dark screen instead of a guest with output to the screen.
Moreover, I have verified with custom logs from xen/common/domctl.c that access to the opregion is denied to domid, but not to dom0, when this patch is not applied.

> Or is it by luck that the address returned by sysfs_dev_get_igd_opregion()
> happened to be the address that hvmloader is going to write? Or maybe
> hvmloader doesn't actually do anything?

I mentioned earlier my plans to verify that hvmloader does provide the device model and the guest with the mapped address of the Intel IGD OpRegion.

> Some more thoughts on that: looking at QEMU, it seems that there's
> already a call to xc_domain_iomem_permission(), in igd_write_opregion().
> So adding one in libxl would seem redundant,

You may be right that with upstream QEMU it is not needed. I will be able to check it once I have a patch for the IRQ problem in upstream QEMU. When I wrote this patch a couple of weeks ago, though, I did not yet know that upstream QEMU also calls xc_domain_iomem_permission(), and since upstream QEMU seems to obtain the necessary permission, the call here to xc_domain_iomem_permission() can be done only when the device model is QEMU traditional.

> or maybe it's the one for the device model's domain that's needed (named
> 'stubdom_domid' here)?

Well, I am not using a device model stub domain but running the device model in dom0, and my tests indicate it is not necessary to obtain permission for dom0, but I do admit I need to do more tests before submitting the next version of the patch. I plan to do some tests with a device model stub domain and see which configurations require permission for the stub domain. I expect it will only be needed when the device model is QEMU traditional, because upstream QEMU obtains the necessary permission.
I have no experience running the device model in a stub domain, and IIRC the Xen wiki explains that the configuration supported by the Xen Project is the traditional device model with Mini-OS in the stub domain, while others, such as Qubes OS, have done some work on running upstream QEMU and Linux in a stub domain. Please correct me if this is not correct.

Thank you for taking the time to look at this patch.

Chuck