Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>>> On 22.09.16 at 17:11, <Kevin.Mayer@xxxxxxxx> wrote:
> Here is a call stack from dmesg.
> Keep in mind that the compiler omits some function names (most importantly
> the vmx_fpu_leave) and also that vmx_vmenter_helper is not actually called.
> The backtrace just thinks it is called because the ud2 which panics the
> hypervisor lies somewhere behind its epilogue.
>
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0801fd8ec>] vmx_vmenter_helper+0x280/0x30a
> (XEN)    [<ffff82d080174f91>] __context_switch+0xdb/0x3b5
> (XEN)    [<ffff82d080178c19>] __sync_local_execstate+0x5e/0x7a
> (XEN)    [<ffff82d080178c3e>] sync_local_execstate+0x9/0xb
> (XEN)    [<ffff82d080179740>] map_domain_page+0xa0/0x5d4
> (XEN)    [<ffff82d080196152>] destroy_perdomain_mapping+0x8f/0x1e8
> (XEN)    [<ffff82d080244a62>] free_compat_arg_xlat+0x26/0x28
> (XEN)    [<ffff82d0801d4081>] hvm_vcpu_destroy+0x112/0x176
> (XEN)    [<ffff82d080175c2c>] vcpu_destroy+0x5d/0x72
> (XEN)    [<ffff82d080105dd4>] complete_domain_destroy+0x49/0x192
> (XEN)    [<ffff82d0801215fd>] rcu_process_callbacks+0x19a/0x1fb
> (XEN)    [<ffff82d08012caf8>] __do_softirq+0x82/0x8d
> (XEN)    [<ffff82d08012cb3b>] process_pending_softirqs+0x38/0x3a
> (XEN)    [<ffff82d0801c23a8>] mwait_idle+0x10c/0x315
> (XEN)    [<ffff82d080174825>] idle_loop+0x51/0x6b

So one possible solution would be to simply avoid calling
altp2m_vcpu_update_p2m() and altp2m_vcpu_update_vmfunc_ve() from
altp2m_vcpu_destroy() for dying domains. However, it looks as if this
would still only paper over the underlying problem.

Yet I continue to have difficulty seeing how we can end up with the
call stack above, without some other earlier bug: I don't think
un-paused vCPU-s are supposed to make it into vcpu_destroy(). Yet at
the moment a vCPU gets paused, sync_vcpu_execstate() would have got
called for it already.
And while both vcpu_check_shutdown() and domain_shutdown() call
vcpu_pause_nosync() (which hence wouldn't result in the needed call
to sync_vcpu_execstate()), domain_kill() calls domain_pause() first
thing, while it drops the domain reference almost last thing. And
only the dropping of the last domain reference can cause execution
to reach complete_domain_destroy().

Could you verify this is what is actually happening, i.e. you're not
suffering from a stray put_domain() somewhere?

And just to double check - you're not having any other code changes
in your tree beyond the default enabling of altp2m?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
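
For illustration only, the reference-counting argument above can be modelled in a few lines of C. This is a toy sketch, not Xen code: struct toy_domain and every toy_* function are hypothetical stand-ins, and the model assumes just what the mail states, namely that the kill path pauses before dropping its reference and that only the last reference drop reaches the final teardown. Under those assumptions, teardown with un-paused vCPUs should only be reachable through an unbalanced (stray) put.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not Xen's real structures. */
struct toy_domain {
    int refcnt;              /* outstanding references to the domain */
    int pause_count;         /* > 0: vCPUs paused, state already synced */
    bool destroyed;          /* final teardown has run */
    bool destroyed_unpaused; /* teardown ran while vCPUs were not paused */
};

static void toy_domain_init(struct toy_domain *d)
{
    d->refcnt = 1;           /* creation reference, dropped by the kill path */
    d->pause_count = 0;
    d->destroyed = false;
    d->destroyed_unpaused = false;
}

static void toy_get_domain(struct toy_domain *d)
{
    d->refcnt++;
}

/* Stands in for the final teardown step (complete_domain_destroy() in the
 * trace); records whether the "paused first" expectation held. */
static void toy_complete_domain_destroy(struct toy_domain *d)
{
    d->destroyed = true;
    d->destroyed_unpaused = (d->pause_count == 0);
}

/* Only dropping the last reference triggers the final teardown. */
static void toy_put_domain(struct toy_domain *d)
{
    assert(d->refcnt > 0);
    if (--d->refcnt == 0)
        toy_complete_domain_destroy(d);
}

/* Models the ordering described above: pause first thing, drop the
 * creation reference almost last thing. */
static void toy_domain_kill(struct toy_domain *d)
{
    d->pause_count++;
    toy_put_domain(d);
}
```

With balanced get/put pairs, teardown in this model can only happen after toy_domain_kill() has paused the domain. A single extra toy_put_domain() on a live domain reaches teardown with pause_count still zero, which is the kind of situation the call trace would suggest.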