
[Xen-devel] Re: Xen 4.1 interrupts not delivered.



Hi Keir,

I don't know if it gives any insight, but I tried running xentrace. The
only thing I don't know is how close to the actual freeze the trace made it to disk ...

In the last 2 seconds of the trace I do sometimes see this:

 169.940118823 ||xl d1v0 hypercall 17 (iret) eip ffffffff810012eb
 169.940119616 ||xl d1v0 hypercall 11 (xen_version) eip ffffffff8100122a
 169.940120050 ||xl d1v0 hypercall 11 (xen_version) eip ffffffff8100122a
 169.940120540 ||xl d1v0 hypercall 1d (sched_op) eip ffffffff810013aa
]169.940120843 ||xl d1v0   28006(2:8:6) 2 [ 1 0 ]
]169.940122066 ||xl d1v0   2800e(2:8:e) 2 [ 1 6db9 ]
]169.940122206 ||xl d1v0   2800f(2:8:f) 3 [ 0 6db9 1c9c380 ]
]169.940122393 ||xl d1v0   2800a(2:8:a) 4 [ 1 0 0 2 ]
 169.940122586 ||xl d1v0 runstate_change d1v0 running->blocked
sched_runstate_process: 1 lost cpus, setting d1v0 runstate to RUNSTATE_LOST
 169.940122820 ||xl d?v? runstate_change d0v2 runnable->running
 169.940124900 |x|l d0v0 page_fault[ db3124a0 2b9e dc0d1000 2b9e 6 ]
 169.940125350 ||xl d0v2 hypercall 11 (xen_version) eip ffffffff8100922a
 169.940125986 ||xl d0v2 hypercall 11 (xen_version) eip ffffffff8100922a
 169.940126983 |x|l d0v0 hypercall 11 (xen_version) eip ffffffff8100922a
 169.940127210 ||xl d0v2 emulate privop[ 8167dc5e ffffffff ]
 169.940127773 ||xl d0v2 emulate privop[ 8167dca6 ffffffff ]

 But perhaps that sounds worse than it actually is.
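
 For checking a longer dump for the same pattern, here is a minimal sketch
 (plain Python, not part of xentrace or xenalyze) that counts events per vcpu
 and notes the timestamp of the last record seen before each "lost cpus"
 message. The line layout is assumed purely from the excerpt above; the names
 summarize, RECORD and LOST are made up for illustration, and treating lines
 that start with "]" as undecoded records is a guess.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the excerpt in this mail:
#   [optional "]"]<seconds> <cpu marker> d<dom>v<vcpu> <event text>
RECORD = re.compile(r"^(\]?)\s*(\d+\.\d+)\s+(\S+)\s+(d\S+v\S+)\s+(.*)$")
# Message printed between records when a cpu's records were lost.
LOST = re.compile(r"^sched_runstate_process: (\d+) lost cpus")


def summarize(lines):
    """Return (event counts per (vcpu, kind), list of (last timestamp, lost cpus))."""
    events = Counter()
    lost = []
    last_ts = None  # timestamp string of the most recent parsed record
    for line in lines:
        line = line.rstrip("\n")
        m = LOST.match(line)
        if m:
            lost.append((last_ts, int(m.group(1))))
            continue
        m = RECORD.match(line)
        if not m:
            continue  # skip anything that does not look like a record
        bracket, ts, _cpu_marker, vcpu, text = m.groups()
        last_ts = ts
        words = text.split()
        # Assumption: a leading "]" marks a record that was not decoded.
        kind = "raw" if bracket else (words[0] if words else "")
        events[(vcpu, kind)] += 1
    return events, lost


if __name__ == "__main__":
    import sys

    counts, lost_marks = summarize(sys.stdin)
    for (vcpu, kind), n in sorted(counts.items()):
        print(f"{vcpu:6} {kind:20} {n}")
    for ts, cpus in lost_marks:
        print(f"lost cpus: {cpus} (after record at {ts})")
```

 It reads the text dump on stdin, so the timestamps printed for the lost
 records can be compared against the end of the trace.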

 This trace was done on:

 - Intel quad core
 - only 1 domU started, with video grabbing on a PCIe xHCI controller; the
device uses MSI-X interrupts
 - xen_changeset : Fri Oct 08 11:41:57 2010 +0100 22230:a33886146b45
 - dom0 kernel: Jeremy's pvops xen/next, last commit
4ac23c27f34a5ea45a098b0f6e08bf5cc6e74756
 - domU kernel: Konrad's pcifront-0.8.1 tree, last commit
369bae8ae5c5e4b122f77726a4c957108ad724ad

Attached:
- last piece of the trace bzip2'ed

--

Sander


Wednesday, October 13, 2010, 9:52:22 AM, you wrote:

> On 13/10/2010 08:00, "Sander Eikelenboom" <linux@xxxxxxxxxxxxxx> wrote:

>> Hello Keir,
>> 
>> OK let's rephrase, in what cases is it logical that the xen serial console
>> freezes together with dom0 ?
>> For example some deadlock causes cpu0 to stall on a heavily loaded system ..
>> I think having the serial console available to dump the machine's state is
>> quite vital :-(

> Oh, there was a fix for serial interrupt routing: xen-unstable:22148 or
> xen-4.0-testing:21342. Are you running a more recent hypervisor than that?
> The fix prevents serial interrupt from being migrated away from pcpu0, which
> will not work as there is no vector allocated for it on other pcpus. This
> kind of fits with the bug you're seeing, which doesn't manifest if you leave
> pcpu0 unloaded (and hence presumably serial interrupt binding prefers to
> stay with unloaded pcpu0).

>  -- Keir

>> I have tried max_cstate=1 together with the latest 2.6.32-xen-next-pvops
>> kernel as dom0 kernel (which has Ian's fix to the event channels).
>> But with the compile test it freezes just as fast.
>> Will try xen before changesets 20072/20073 now, probably with 2.6.31 pvops,
>> since 2.6.32 would need a more recent hypervisor.
>> 
>> --
>> Sander
>> 
>> 
>> Wednesday, October 13, 2010, 1:34:58 AM, you wrote:
>> 
>>> On 12/10/2010 18:17, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx> wrote:
>> 
>>>> A couple of things that might fix the problems are:
>>>> 
>>>>  1). Ian's fix to the event channels:
>>>> http://xenbits.xen.org/gitweb?p=people/ianc/linux-2.6.git;a=commit;h=5d30cb2a85912ffb5f6556d55472c26801eef2ea
>>>>  2). Disable IRQ balancing in Xen (and also in Linux kernel). "noirqbalance"
>>>>  3). Pin domains, but nothing to Domain 0.
>> 
>>> ITYM cpu 0. Not that this should rightly make any difference that I can see.
>> 
>>> My suspicion would be the per-CPU IDT patches introduced during 4.0
>>> development. Or changes to enable deep C-state sleeps by default. One or the
>>> other causing lost interrupts. I think the latter can be discounted by
>>> max_cstate=1 as a Xen boot parameter. The former would require trying a
>>> build of Xen before and after changesets 20072/20073 -- they are the ones
>>> that did the heavy lifting to implement per-CPU IDTs.
>> 
>>>  -- Keir
>> 
>>>> But it might be worth trying them out?
>> 
>> 
>> 
>> 





-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx

Attachment: xen-trace.bz2
Description: Binary data

