Re: [Xen-devel] System freeze with IGD passthrough
On Thu, Dec 20, 2012 at 12:04:01AM +0800, G.R. wrote:
> On Wed, Dec 19, 2012 at 2:20 PM, G.R. <firemeteor@xxxxxxxxxxxxxxxxxxxxx>
> wrote:
> > Adding Jean, the author of the opregion patch.
> >
> > Jean, I believe the warning is due to the offset within the page.
> > To accommodate the offset, you would need to reserve another page for it.
> > Will the extra page cause any unexpected problem?
> >
> > The original thread is about an instability issue that directly freezes the
> > host.
> > I believe the warning above should not have such an effect.
> > What do you think? And any suggestion?
> >
>
> Jean appears to be no longer reachable.
> The warning I found turns out to be not relevant.
> According to the OpRegion spec, the tail part is reserved and should
> never be touched by the guest.
> But anyway, I had a local fix to get rid of the warning, by reserving
> one more page and mapping it when the host opregion is not page aligned.
> I'll send it to a separate thread.
>
> Back to the topic. I updated to xen 4.2.1 and tried three times tonight.
> Two of them led to a total freeze with no error log available, after
> game playing for a couple of minutes.
> And the last try ended up with a GPU hang after 10+ minutes of game playing.
> This is a guest only hang. But I still have no way to check the GPU error
> state even though it has been collected:
>
> [ 1553.588076] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1553.592112] [drm] capturing error event; look for more information
> in /debug/dri/0/i915_error_state
> [ 1582.004075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1597.220075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1613.220074] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung

Those also appear with baremetal (Linus actually mentioned this).

>
> I'm wondering if the two syndromes are due to the same underlying cause.
> But I guess a GPU hang caused by a guest driver issue should not freeze
> the host. Is it true?

It shouldn't. Is the machine usable with this guest being frozen?

>
> I'm going to try more with different config -- different kernel
> version, with / without PVOPS, native run vs VM etc.
> But this is kind of blind since I have no clue at all. If you have
> anything to suspect, it will be highly appreciated.
>
> Thanks,
> Timothy
>
> > Thanks,
> > Timothy
> >
> > On Wed, Dec 19, 2012 at 1:28 AM, G.R. <firemeteor@xxxxxxxxxxxxxxxxxxxxx>
> > wrote:
> >> Hi Stefano,
> >>
> >> I recently tried to play some 3D games on my linux guest.
> >> The game starts without problem but it freezes the entire system after
> >> some time (a minute or so?).
> >> Here I mean both the host and domU are not responsive anymore.
> >> The ssh freezes and I had to shut down the machine using the power button
> >> directly.
> >>
> >> I did not find anything obvious from the host log. But from the guest,
> >> I can find this:
> >>
> >> Dec 18 20:28:38 debvm kernel: [ 0.899860] resource map sanity check
> >> conflict: 0xfeff5018 0xfeff7017 0xfeff7000 0xffffffff reserved
> >> Dec 18 20:28:38 debvm kernel: [ 0.899862] ------------[ cut here
> >> ]------------
> >> Dec 18 20:28:38 debvm kernel: [ 0.899869] WARNING: at
> >> arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2c4/0x33c()
> >> Dec 18 20:28:38 debvm kernel: [ 0.899870] Hardware name: HVM domU
> >> Dec 18 20:28:38 debvm kernel: [ 0.899872] Info: mapping multiple
> >> BARs. Your kernel is fine.
> >> Dec 18 20:28:38 debvm kernel: [ 0.899873] Modules linked in:
> >> Dec 18 20:28:38 debvm kernel: [ 0.899878] Pid: 1, comm: swapper/0
> >> Not tainted 3.6.9 #4
> >> Dec 18 20:28:38 debvm kernel: [ 0.899892] Call Trace:
> >> Dec 18 20:28:38 debvm kernel: [ 0.899896] [<ffffffff8103d194>] ?
> >> warn_slowpath_common+0x76/0x8a
> >> Dec 18 20:28:38 debvm kernel: [ 0.899898] [<ffffffff8103d240>] ?
> >> warn_slowpath_fmt+0x45/0x4a
> >> Dec 18 20:28:38 debvm kernel: [ 0.899900] [<ffffffff81032a6c>] ?
> >> __ioremap_caller+0x2c4/0x33c
> >> Dec 18 20:28:38 debvm kernel: [ 0.899902] [<ffffffff812c3be3>] ?
> >> intel_opregion_setup+0x9c/0x201
> >> Dec 18 20:28:38 debvm kernel: [ 0.899904] [<ffffffff812bcb75>] ?
> >> intel_setup_gmbus+0x175/0x19d
> >> Dec 18 20:28:38 debvm kernel: [ 0.899907] [<ffffffff8128a37a>] ?
> >> i915_driver_load+0x548/0x90d
> >> Dec 18 20:28:38 debvm kernel: [ 0.899910] [<ffffffff812ff804>] ?
> >> setup_hpet_msi_remapped+0x20/0x20
> >> Dec 18 20:28:38 debvm kernel: [ 0.899912] [<ffffffff81272706>] ?
> >> drm_get_pci_dev+0x152/0x259
> >> Dec 18 20:28:38 debvm kernel: [ 0.899915] [<ffffffff813d4883>] ?
> >> _raw_spin_lock_irqsave+0x21/0x45
> >> Dec 18 20:28:38 debvm kernel: [ 0.899918] [<ffffffff811d9ecc>] ?
> >> local_pci_probe+0x5a/0xa0
> >> Dec 18 20:28:38 debvm kernel: [ 0.899920] [<ffffffff811d9fcf>] ?
> >> pci_device_probe+0xbd/0xe7
> >> Dec 18 20:28:38 debvm kernel: [ 0.899922] [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [ 0.899923] [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [ 0.899925] [<ffffffff812cd769>] ?
> >> driver_probe_device+0x92/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [ 0.899926] [<ffffffff812cd8da>] ?
> >> __driver_attach+0x53/0x73
> >> Dec 18 20:28:38 debvm kernel: [ 0.899928] [<ffffffff812cc06f>] ?
> >> bus_for_each_dev+0x46/0x77
> >> Dec 18 20:28:38 debvm kernel: [ 0.899930] [<ffffffff812ccf8f>] ?
> >> bus_add_driver+0xd5/0x1f4
> >> Dec 18 20:28:38 debvm kernel: [ 0.899931] [<ffffffff812cde14>] ?
> >> driver_register+0x89/0x101
> >> Dec 18 20:28:38 debvm kernel: [ 0.899933] [<ffffffff811d9336>] ?
> >> __pci_register_driver+0x49/0xa3
> >> Dec 18 20:28:38 debvm kernel: [ 0.899935] [<ffffffff816d55c7>] ?
> >> ttm_init+0x63/0x63
> >> Dec 18 20:28:38 debvm kernel: [ 0.899937] [<ffffffff81002085>] ?
> >> do_one_initcall+0x75/0x12c
> >> Dec 18 20:28:38 debvm kernel: [ 0.899940] [<ffffffff816a6cc2>] ?
> >> kernel_init+0x13c/0x1c0
> >> Dec 18 20:28:38 debvm kernel: [ 0.899941] [<ffffffff816a6565>] ?
> >> do_early_param+0x83/0x83
> >> Dec 18 20:28:38 debvm kernel: [ 0.899943] [<ffffffff813d9f44>] ?
> >> kernel_thread_helper+0x4/0x10
> >> Dec 18 20:28:38 debvm kernel: [ 0.899945] [<ffffffff816a6b86>] ?
> >> start_kernel+0x3e1/0x3e1
> >> Dec 18 20:28:38 debvm kernel: [ 0.899947] [<ffffffff813d9f40>] ?
> >> gs_change+0x13/0x13
> >> Dec 18 20:28:38 debvm kernel: [ 0.899950] ---[ end trace
> >> db461543ce599b44 ]---
> >>
> >> I'm not sure if this has anything to do with the freeze. This seems to
> >> show up on every boot after I upgraded to xen version 4.2.1-rc2. Both
> >> debian kernels 3.2.32 / 3.6.9 show the same log. But the whole
> >> system freeze happens only during gaming, which is much less frequent.
> >> So I'm not sure if the two are related. But anyway, could you comment
> >> on what this log means?
> >>
> >> I can find one of the mentioned addresses in the qemu_dm log:
> >> pt_pci_write_config: [00:02:0] address=00fc val=0xfeff5000 len=4
> >> igd_write_opregion: Map OpRegion: cd996018 -> feff5018
> >> igd_write_opregion: [00:02:0] addr=fc len=2 val=feff5000
> >>
> >> PS: I also run xbmc on domU and it plays back video under HW
> >> acceleration (VAAPI) without any problem. XBMC by itself is also a
> >> graphics intensive program. But this runs on a pure HVM guest, while
> >> the failing case is on PVHVM.
> >>
> >> PS2: I also suffered another instability yesterday. It happened when I
> >> was compiling a kernel inside the domU. The host rebooted suddenly.
> >> Since I was not using graphics at that time (Xorg session was idle, I
> >> connected through SSH), this may be a different issue.
> >>
> >> Thanks,
> >> Timothy
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
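
The point about the offset within the page can be checked with a little address arithmetic on the two ranges printed by the "resource map sanity check conflict" line. The following is only a rough sketch: the 4 KiB page size, the helper name pages_spanned, and the reading of the first address pair as the mapped OpRegion range and the second pair as the reserved range are assumptions made for illustration, and none of it is taken from the Xen or qemu code.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, the usual x86 page size


def pages_spanned(start, end_inclusive, page_size=PAGE_SIZE):
    """Count the pages touched by the byte range [start, end_inclusive]."""
    first_page = start // page_size
    last_page = end_inclusive // page_size
    return last_page - first_page + 1


# Ranges copied from the guest warning in the thread.
opregion_start, opregion_end = 0xfeff5018, 0xfeff7017
reserved_start = 0xfeff7000

size = opregion_end - opregion_start + 1
print(hex(size))                                    # 0x2000, i.e. 8 KiB
print(pages_spanned(opregion_start, opregion_end))  # 3 pages because of the 0x018 offset
print(pages_spanned(0xfeff5000, 0xfeff6fff))        # 2 pages if the region were page aligned

# The last page touched starts exactly where the reserved range begins,
# which would explain why the sanity check reports a conflict.
last_page_base = (opregion_end // PAGE_SIZE) * PAGE_SIZE
print(last_page_base == reserved_start)             # True
```

Under these assumptions, an 8 KiB region that starts 0x18 bytes into a page spills 0x18 bytes into a third page, which matches the suggestion in the thread to reserve one more page when the host OpRegion is not page aligned.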