
i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"



Hi,

Since 5.19, I have been observing severe glitches (mostly, but not only,
horizontal black stripes) when using the IGD in a Xen PV dom0. After a short
time Xorg crashes, and dmesg contains messages like this:

    i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
    i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
    i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang

The issue can be observed on several different hardware generations (at
least Ivy Bridge, Tiger Lake and Kaby Lake). It doesn't always happen
immediately; sometimes I need to start several VMs first.
An example of what it looks like:
https://openqa.qubes-os.org/tests/48187#step/qui_widgets_notifications/8

More screenshots and logs are linked at 
https://github.com/QubesOS/qubes-issues/issues/7813

I managed to git bisect the issue and ended up with this as the first
bad commit:

    commit a2daa27c0c6137481226aee5b3136e453c642929
    Author: Christoph Hellwig <hch@xxxxxx>
    Date:   Mon Feb 14 11:44:42 2022 +0100

        swiotlb: simplify swiotlb_max_segment
        
        Remove the bogus Xen override that was usually larger than the actual
        size and just calculate the value on demand.  Note that
        swiotlb_max_segment still doesn't make sense as an interface and should
        eventually be removed.
        
        Signed-off-by: Christoph Hellwig <hch@xxxxxx>
        Reviewed-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
        Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
        Tested-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>

I tried reverting just this commit on top of 6.0.x, but the context
changed significantly in subsequent commits, so after trying to revert
it together with 3 or 4 more commits, I gave up.

One possibly important detail: the system heavily uses cross-VM shared
memory (gntdev) to map window contents from VMs. This is Qubes OS, and
it uses Xen 4.14.


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab



 

