
Re: [Xen-devel] System freeze with IGD passthrough



On Thu, Dec 20, 2012 at 12:04:01AM +0800, G.R. wrote:
> On Wed, Dec 19, 2012 at 2:20 PM, G.R. <firemeteor@xxxxxxxxxxxxxxxxxxxxx> 
> wrote:
> > Adding Jean, the author of the opregion patch.
> >
> > Jean, I believe the warning is due to the offset within the page.
> > To accommodate the offset, you would need to reserve another page for it.
> > Will the extra page cause any unexpected problem?
> >
> > The original thread is about an instability issue that directly freezes the
> > host.
> > I believe the warning above should not have such an effect.
> > What do you think? Any suggestions?
> >
> 
> Jean appears to be no longer reachable.
> The warning I found turns out to be not relevant.
> According to the OpRegion spec, the tail part is reserved and should
> never be touched by the guest.
> But anyway, I have a local fix to get rid of the warning by reserving
> one more page and mapping it when the host OpRegion is not page aligned.
> I'll send it to a separate thread.
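
For reference, the page arithmetic behind that warning can be sketched as
below. This is only an illustration, not the actual patch: it assumes 4 KiB
pages and an 8 KiB OpRegion (the size implied by the 0xfeff5018..0xfeff7017
range in the guest warning), and the helper name pages_spanned is made up
for this example.

```python
PAGE_SIZE = 0x1000      # assumption: 4 KiB x86 pages
OPREGION_SIZE = 0x2000  # assumption: 8 KiB, inferred from the warning's range


def pages_spanned(base, size, page_size=PAGE_SIZE):
    """Return how many pages the byte range [base, base + size) touches."""
    first_page = base // page_size
    last_page = (base + size - 1) // page_size
    return last_page - first_page + 1


# Guest address from the qemu_dm log: the region starts 0x18 into its page.
guest_base = 0xfeff5018

# Last byte of the mapping, matching the second value in the guest warning.
print(hex(guest_base + OPREGION_SIZE - 1))

# An unaligned 8 KiB region touches one more page than an aligned one.
print(pages_spanned(guest_base, OPREGION_SIZE))
print(pages_spanned(0xfeff5000, OPREGION_SIZE))
```

With the 0x18 offset the region ends in the page at 0xfeff7000, which is
where the range marked "reserved" in the warning begins.
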
> 
> Back to the topic. I updated to xen 4.2.1 and tried three times tonight.
> Two of them led to a total freeze with no error log available, after
> a couple of minutes of game playing.
> The last try ended with a GPU hang after 10+ minutes of game playing.
> This is a guest-only hang. But I still have no way to check the GPU error
> state even though it has been collected:
> 
> [ 1553.588076] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1553.592112] [drm] capturing error event; look for more information
> in /debug/dri/0/i915_error_state
> [ 1582.004075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1597.220075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1613.220074] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung

Those also appear with baremetal (Linus actually mentioned this).

> 
> I'm wondering if the two symptoms are due to the same underlying cause.
> But I guess a GPU hang caused by a guest driver issue should not freeze
> the host. Is that true?

It shouldn't. Is the machine usable while this guest is frozen?
> 
> I'm going to try more with different configs -- different kernel
> versions, with / without PVOPS, native run vs VM etc.
> But this is shooting in the dark since I have no clue at all. If you
> suspect anything in particular, pointers would be highly appreciated.
> 
> Thanks,
> Timothy
> 
> > Thanks,
> > Timothy
> >
> > On Wed, Dec 19, 2012 at 1:28 AM, G.R. <firemeteor@xxxxxxxxxxxxxxxxxxxxx> 
> > wrote:
> >> Hi Stefano,
> >>
> >> I recently tried to play some 3D games on my linux guest.
> >> The game starts without problem but it freezes the entire system after
> >> some time (a minute or so?).
> >> By that I mean both the host and the domU become unresponsive.
> >> The ssh session freezes and I had to shut down the machine using the
> >> power button directly.
> >>
> >> I did not find anything obvious from the host log. But from the guest,
> >> I can find this:
> >>
> >> Dec 18 20:28:38 debvm kernel: [    0.899860] resource map sanity check
> >> conflict: 0xfeff5018 0xfeff7017 0xfeff7000 0xffffffff reserved
> >> Dec 18 20:28:38 debvm kernel: [    0.899862] ------------[ cut here
> >> ]------------
> >> Dec 18 20:28:38 debvm kernel: [    0.899869] WARNING: at
> >> arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2c4/0x33c()
> >> Dec 18 20:28:38 debvm kernel: [    0.899870] Hardware name: HVM domU
> >> Dec 18 20:28:38 debvm kernel: [    0.899872] Info: mapping multiple
> >> BARs. Your kernel is fine.
> >> Dec 18 20:28:38 debvm kernel: [    0.899873] Modules linked in:
> >> Dec 18 20:28:38 debvm kernel: [    0.899878] Pid: 1, comm: swapper/0
> >> Not tainted 3.6.9 #4
> >> Dec 18 20:28:38 debvm kernel: [    0.899892] Call Trace:
> >> Dec 18 20:28:38 debvm kernel: [    0.899896]  [<ffffffff8103d194>] ?
> >> warn_slowpath_common+0x76/0x8a
> >> Dec 18 20:28:38 debvm kernel: [    0.899898]  [<ffffffff8103d240>] ?
> >> warn_slowpath_fmt+0x45/0x4a
> >> Dec 18 20:28:38 debvm kernel: [    0.899900]  [<ffffffff81032a6c>] ?
> >> __ioremap_caller+0x2c4/0x33c
> >> Dec 18 20:28:38 debvm kernel: [    0.899902]  [<ffffffff812c3be3>] ?
> >> intel_opregion_setup+0x9c/0x201
> >> Dec 18 20:28:38 debvm kernel: [    0.899904]  [<ffffffff812bcb75>] ?
> >> intel_setup_gmbus+0x175/0x19d
> >> Dec 18 20:28:38 debvm kernel: [    0.899907]  [<ffffffff8128a37a>] ?
> >> i915_driver_load+0x548/0x90d
> >> Dec 18 20:28:38 debvm kernel: [    0.899910]  [<ffffffff812ff804>] ?
> >> setup_hpet_msi_remapped+0x20/0x20
> >> Dec 18 20:28:38 debvm kernel: [    0.899912]  [<ffffffff81272706>] ?
> >> drm_get_pci_dev+0x152/0x259
> >> Dec 18 20:28:38 debvm kernel: [    0.899915]  [<ffffffff813d4883>] ?
> >> _raw_spin_lock_irqsave+0x21/0x45
> >> Dec 18 20:28:38 debvm kernel: [    0.899918]  [<ffffffff811d9ecc>] ?
> >> local_pci_probe+0x5a/0xa0
> >> Dec 18 20:28:38 debvm kernel: [    0.899920]  [<ffffffff811d9fcf>] ?
> >> pci_device_probe+0xbd/0xe7
> >> Dec 18 20:28:38 debvm kernel: [    0.899922]  [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899923]  [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899925]  [<ffffffff812cd769>] ?
> >> driver_probe_device+0x92/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899926]  [<ffffffff812cd8da>] ?
> >> __driver_attach+0x53/0x73
> >> Dec 18 20:28:38 debvm kernel: [    0.899928]  [<ffffffff812cc06f>] ?
> >> bus_for_each_dev+0x46/0x77
> >> Dec 18 20:28:38 debvm kernel: [    0.899930]  [<ffffffff812ccf8f>] ?
> >> bus_add_driver+0xd5/0x1f4
> >> Dec 18 20:28:38 debvm kernel: [    0.899931]  [<ffffffff812cde14>] ?
> >> driver_register+0x89/0x101
> >> Dec 18 20:28:38 debvm kernel: [    0.899933]  [<ffffffff811d9336>] ?
> >> __pci_register_driver+0x49/0xa3
> >> Dec 18 20:28:38 debvm kernel: [    0.899935]  [<ffffffff816d55c7>] ?
> >> ttm_init+0x63/0x63
> >> Dec 18 20:28:38 debvm kernel: [    0.899937]  [<ffffffff81002085>] ?
> >> do_one_initcall+0x75/0x12c
> >> Dec 18 20:28:38 debvm kernel: [    0.899940]  [<ffffffff816a6cc2>] ?
> >> kernel_init+0x13c/0x1c0
> >> Dec 18 20:28:38 debvm kernel: [    0.899941]  [<ffffffff816a6565>] ?
> >> do_early_param+0x83/0x83
> >> Dec 18 20:28:38 debvm kernel: [    0.899943]  [<ffffffff813d9f44>] ?
> >> kernel_thread_helper+0x4/0x10
> >> Dec 18 20:28:38 debvm kernel: [    0.899945]  [<ffffffff816a6b86>] ?
> >> start_kernel+0x3e1/0x3e1
> >> Dec 18 20:28:38 debvm kernel: [    0.899947]  [<ffffffff813d9f40>] ?
> >> gs_change+0x13/0x13
> >> Dec 18 20:28:38 debvm kernel: [    0.899950] ---[ end trace
> >> db461543ce599b44 ]---
> >>
> >> I'm not sure if this has anything to do with the freeze. This seems to
> >> show up on every boot after I upgraded to xen version 4.2.1-rc2. Both
> >> debian kernels 3.2.32 / 3.6.9 produce the same log. But the whole
> >> system freeze happens only during gaming, which is much less frequent.
> >> So I'm not sure if the two are related. But anyway, could you comment
> >> on what this log means?
> >>
> >> I can find one of the mentioned addresses in the qemu_dm log:
> >> pt_pci_write_config: [00:02:0] address=00fc val=0xfeff5000 len=4
> >> igd_write_opregion: Map OpRegion: cd996018 -> feff5018
> >> igd_write_opregion: [00:02:0] addr=fc len=2 val=feff5000
> >>
> >> PS: I also run xbmc on a domU and it plays back video under HW
> >> acceleration (VAAPI) without any problem. XBMC by itself is also a
> >> graphics-intensive program. But this runs on a pure HVM guest, while
> >> the failing case is on PVHVM.
> >>
> >> PS2: I also suffered another instability yesterday. It happened when I
> >> was compiling a kernel inside the domU. The host rebooted suddenly.
> >> Since I was not using graphics at that time (the Xorg session was idle, I
> >> was connected through SSH), this may be a different issue.
> >>
> >> Thanks,
> >> Timothy
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> 


 

