Re: [XEN PATCH] tools/libs/light/libxl_pci.c: explicitly grant access to Intel IGD opregion
On 4/1/22 9:21 AM, Chuck Zmudzinski wrote:
> On 3/30/22 2:45 PM, Jason Andryuk wrote:
>> On Fri, Mar 18, 2022 at 4:13 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>> On 14.03.2022 04:41, Chuck Zmudzinski wrote:
>>>> When gfx_passthru is enabled for the Intel IGD, hvmloader maps the IGD
>>>> opregion to the guest but libxl does not grant the guest permission to
>>>> access the mapped memory region. This results in a crash of the i915.ko
>>>> kernel module in a Linux HVM guest when it needs to access the IGD
>>>> opregion:
>>>>
>>>> Oct 23 11:36:33 domU kernel: Call Trace:
>>>> Oct 23 11:36:33 domU kernel: ? idr_alloc+0x39/0x70
>>>> Oct 23 11:36:33 domU kernel: drm_get_last_vbltimestamp+0xaa/0xc0 [drm]
>>>> Oct 23 11:36:33 domU kernel: drm_reset_vblank_timestamp+0x5b/0xd0 [drm]
>>>> Oct 23 11:36:33 domU kernel: drm_crtc_vblank_on+0x7b/0x130 [drm]
>>>> Oct 23 11:36:33 domU kernel: intel_modeset_setup_hw_state+0xbd4/0x1900 [i915]
>>>> Oct 23 11:36:33 domU kernel: ? _cond_resched+0x16/0x40
>>>> Oct 23 11:36:33 domU kernel: ? ww_mutex_lock+0x15/0x80
>>>> Oct 23 11:36:33 domU kernel: intel_modeset_init_nogem+0x867/0x1d30 [i915]
>>>> Oct 23 11:36:33 domU kernel: ? gen6_write32+0x4b/0x1c0 [i915]
>>>> Oct 23 11:36:33 domU kernel: ? intel_irq_postinstall+0xb9/0x670 [i915]
>>>> Oct 23 11:36:33 domU kernel: i915_driver_probe+0x5c2/0xc90 [i915]
>>>> Oct 23 11:36:33 domU kernel: ? vga_switcheroo_client_probe_defer+0x1f/0x40
>>>> Oct 23 11:36:33 domU kernel: ? i915_pci_probe+0x3f/0x150 [i915]
>>>> Oct 23 11:36:33 domU kernel: local_pci_probe+0x42/0x80
>>>> Oct 23 11:36:33 domU kernel: ? _cond_resched+0x16/0x40
>>>> Oct 23 11:36:33 domU kernel: pci_device_probe+0xfd/0x1b0
>>>> Oct 23 11:36:33 domU kernel: really_probe+0x222/0x480
>>>> Oct 23 11:36:33 domU kernel: driver_probe_device+0xe1/0x150
>>>> Oct 23 11:36:33 domU kernel: device_driver_attach+0xa1/0xb0
>>>> Oct 23 11:36:33 domU kernel: __driver_attach+0x8a/0x150
>>>> Oct 23 11:36:33 domU kernel: ? device_driver_attach+0xb0/0xb0
>>>> Oct 23 11:36:33 domU kernel: ? device_driver_attach+0xb0/0xb0
>>>> Oct 23 11:36:33 domU kernel: bus_for_each_dev+0x78/0xc0
>>>> Oct 23 11:36:33 domU kernel: bus_add_driver+0x12b/0x1e0
>>>> Oct 23 11:36:33 domU kernel: driver_register+0x8b/0xe0
>>>> Oct 23 11:36:33 domU kernel: ? 0xffffffffc06b8000
>>>> Oct 23 11:36:33 domU kernel: i915_init+0x5d/0x70 [i915]
>>>> Oct 23 11:36:33 domU kernel: do_one_initcall+0x44/0x1d0
>>>> Oct 23 11:36:33 domU kernel: ? do_init_module+0x23/0x260
>>>> Oct 23 11:36:33 domU kernel: ? kmem_cache_alloc_trace+0xf5/0x200
>>>> Oct 23 11:36:33 domU kernel: do_init_module+0x5c/0x260
>>>> Oct 23 11:36:33 domU kernel: __do_sys_finit_module+0xb1/0x110
>>>> Oct 23 11:36:33 domU kernel: do_syscall_64+0x33/0x80
>>>> Oct 23 11:36:33 domU kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> The call trace alone leaves open where exactly the crash occurred.
>>> Looking at 5.17 I notice that the first thing the driver does after
>>> mapping the range is to check the signature (both in
>>> intel_opregion_setup()). As the signature can't possibly match with no
>>> access granted to the underlying mappings, there shouldn't be any
>>> further attempts to use the region in the driver; if there are, I'd
>>> view this as a driver bug.
>>
>> Yes. i915_driver_hw_probe does not check the return value of
>> intel_opregion_setup(dev_priv) and just continues on.
>>
>> Chuck, the attached patch may help if you want to test it.
>>
>> Regards,
>> Jason
>
> I tested the patch - it made no noticeable difference.

Correction (sorry for the confusion):

I didn't know I needed to replace more than just a re-built i915.ko module to enable the patch for testing. When I updated the entire Debian kernel package including all the modules and the kernel image with the patched kernel package, it made quite a difference.

With Jason's patch, the three call traces just became a much shorter error message:

Apr 05 20:46:18 debian kernel: xen: --> pirq=16 -> irq=24 (gsi=24)
Apr 05 20:46:18 debian kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Apr 05 20:46:18 debian kernel: i915 0000:00:02.0: vgaarb: deactivate vga console
Apr 05 20:46:18 debian kernel: Console: switching to colour dummy device 80x25
Apr 05 20:46:18 debian kernel: i915 0000:00:02.0: [drm] DMAR active, disabling use of stolen memory
Apr 05 20:46:18 debian kernel: resource sanity check: requesting [mem 0xffffffff-0x100001ffe], which spans more than Reserved [mem 0xfdfff000-0xffffffff]
Apr 05 20:46:18 debian kernel: caller memremap+0xeb/0x1c0 mapping multiple BARs
Apr 05 20:46:18 debian kernel: i915 0000:00:02.0: Device initialization failed (-22)
Apr 05 20:46:18 debian kernel: i915 0000:00:02.0: Please file a bug on drm/i915; see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
Apr 05 20:46:18 debian kernel: i915: probe of 0000:00:02.0 failed with error -22
--------------------- End of Kernel Error Log ----------------------

So I think the patch does propagate the error up the stack and bails out before producing the call traces, and I even had output after booting: the gdm3 Gnome display manager login page displayed. But when I tried to log in to the Gnome desktop, the screen went dark, and after that I could not even log in to the headless Xen Dom0 control domain via ssh, so I used the reset button on the machine to reboot it. So the patch causes some trouble with the Dom0 when the guest cannot access the opregion.

The patch works fine when the guest can access the opregion, and in that case I was able to log in to the Gnome session. But when the guest cannot access the opregion, it caused quite a bit of trouble: it apparently crashed the Dom0, or at least caused networking in the Dom0 to stop working, when I tried to log in to the Gnome session in the guest.

So I would not recommend Jason's patch as is for the Linux kernel. The main reason is that it looks like it is working at first, with a login screen displayed, but when a user tries to log in, the whole system crashes.

Regards,
Chuck
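
For illustration only, the failure path discussed in this thread can be modelled in a few lines of standalone C. This is a sketch, not the i915 code and not Jason's patch: model_opregion_setup() and model_hw_probe() are hypothetical stand-ins for intel_opregion_setup() and the probe path, and the signature string and 8 KiB size are assumptions about the driver's constants. It shows the two points raised above: a region the guest cannot read cannot carry the expected signature, and the caller should return that error rather than continue.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the i915 driver's constants; not taken from this thread. */
#define OPREGION_SIGNATURE "IntelGraphicsMem"
#define OPREGION_SIZE      (8 * 1024)

static_assert(sizeof(OPREGION_SIGNATURE) - 1 == 16, "signature is 16 bytes");

/* Inclusive end address of the range requested for a given base. */
static uint64_t opregion_range_end(uint64_t base)
{
    return base + OPREGION_SIZE - 1;
}

/* Hypothetical stand-in for intel_opregion_setup(): validate the mapping. */
static int model_opregion_setup(const void *mapped)
{
    if (mapped == NULL)
        return -ENOMEM;   /* the mapping itself failed */
    if (memcmp(mapped, OPREGION_SIGNATURE, 16) != 0)
        return -EINVAL;   /* signature mismatch, e.g. region not readable */
    return 0;
}

/* Hypothetical stand-in for the probe path: check and propagate the error. */
static int model_hw_probe(const void *mapped)
{
    int ret = model_opregion_setup(mapped);

    if (ret)
        return ret;       /* bail out instead of using a bad region */
    return 0;
}
```

On Linux EINVAL is 22, which is the numeric form of the "-22" in the log above. Assuming a base of 0xffffffff and an 8 KiB region, opregion_range_end() gives 0x100001ffe, which is consistent with the range reported by the resource sanity check.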