[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default


  • To: <JBeulich@xxxxxxxx>
  • From: <Kevin.Mayer@xxxxxxxx>
  • Date: Wed, 3 Aug 2016 13:24:49 +0000
  • Accept-language: de-DE, en-US
  • Cc: andrew.cooper3@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx
  • Delivery-date: Wed, 03 Aug 2016 13:25:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AdHpan9MOz4yjQpsRm+be9rxBdX5KAABXWeAAMyAUGAAAdu7AAA3ibpA
  • Thread-topic: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

Hi guys

I got around to take a closer look at the crash dump today.

tl;dr:
You were right, vmx_vmenter_helper is not called at all in the call stack.
The real reason behind the [<ffff82d0801fd23a>] vmx_vmenter_helper+0x27e/0x30a 
should be a failed
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void 
vmx_fpu_leave(struct vcpu *v).
Long story in "Chapter1".

Concerning the stray vmx_vcpu_update_eptp:
This seems to be leftovers (either due to a corrupted stack or simply 
uninitialized local variables) of previous function calls originating in 
hvm_vcpu_destroy.
More precisely:
if ( hvm_altp2m_supported() )
        altp2m_vcpu_destroy(v);
being called BEFORE free_compat_arg_xlat.
I assume some kind of error in the altp2m_vcpu_destroy-path to be responsible 
for the crash, but I have no idea where and how to start investigating.
Long story in "Chapter2".



Chapter1:
I started with a function in the callstack and followed the assembly code to 
deduce where the (XEN)    [<ffff82d0801fd23a>] vmx_vmenter_helper+0x27e/0x30a 
comes from:

sync_local_execstate:
0xffff82d080178c36 <sync_local_execstate+1>: mov    %rsp,%rbp
0xffff82d080178c39 <sync_local_execstate+4>: callq  0xffff82d080178bbb 
<__sync_local_execstate>
0xffff82d080178c3e <sync_local_execstate+9>: pop    %rbp

__sync_local_execstate:
0xffff82d080178c09 <__sync_local_execstate+78>:      cmp    %rsi,0x7fe8(%rax)
0xffff82d080178c10 <__sync_local_execstate+85>:      je     0xffff82d080178c14 
<__sync_local_execstate+89>
0xffff82d080178c12 <__sync_local_execstate+87>:      ud2
0xffff82d080178c13 <__sync_local_execstate+88>:      or     %eax,%ebp
0xffff82d080178c15 <__sync_local_execstate+90>:      popfq
0xffff82d080178c16 <__sync_local_execstate+91>:      retq   $0xffff
0xffff82d080178c19 <__sync_local_execstate+94>:      and    $0x200,%ebx

Here crash / gdb seem to get confused with the je.

crash> x /3i __sync_local_execstate+89
   0xffff82d080178c14 <__sync_local_execstate+89>:      callq  
0xffff82d080174eb6 <__context_switch>
   0xffff82d080178c19 <__sync_local_execstate+94>:      and    $0x200,%ebx
   0xffff82d080178c1f <__sync_local_execstate+100>:     pushfq
   
It seems this code calls the __context_switch:
switch_required = (this_cpu(curr_vcpu) != current);

    if ( switch_required )
    {
        ASSERT(current == idle_vcpu[smp_processor_id()]);
        __context_switch();
    }
    
Up to the __context_switch everything seems to be running as it should. Except 
for the stray
[ffff830617fd7d38] vmx_vcpu_update_eptp at ffff82d0801f7c6b
which can be found in the "crash> bt" output but not in the "dmesg".

__context_switch:
0xffff82d080174f7f <__context_switch+201>:   mov    %r14,%rdi
0xffff82d080174f82 <__context_switch+204>:   callq  0xffff82d08017c474 
<vcpu_save_fpu>
0xffff82d080174f87 <__context_switch+209>:   mov    %r14,%rdi
0xffff82d080174f8a <__context_switch+212>:   callq  *0x3a8(%r14)

Following r14 / rdi ( 0xffff8300e6fc0000 ) as given in the crash dump seemingly 
leads to a vtable with a function pointer at the offset 0x3a8:
0xffff82d0801fa06e
crash> x /i 0xffff82d0801fa06e
   0xffff82d0801fa06e <vmx_ctxt_switch_from>:   push   %rbp

This call, which does not show up in the backtrace, is expected at this 
position when looking at the C-code:
static void __context_switch(void)
[...]
if ( !is_idle_domain(pd) )
{
    memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
    vcpu_save_fpu(p);
    p->arch.ctxt_switch_from(p);
[...]

as it is set in:
static int vmx_vcpu_initialise(struct vcpu *v)
[...]
    v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
[...]

Finally at:

0xffff82d0801fa0c3 <vmx_ctxt_switch_from+85>:        mov    $0x6c00,%edx
0xffff82d0801fa0c8 <vmx_ctxt_switch_from+90>:        vmwrite %rax,%rdx
0xffff82d0801fa0cb <vmx_ctxt_switch_from+93>:        jbe    0xffff82d0801fd23a

The jump to [<ffff82d0801fd23a>] vmx_vmenter_helper+0x27e/0x30a (ud2 following 
vmx_vmenter_helper) is done.
vmx_ctxt_switch_from is rather short in C and the called static functions are 
inlined.
static void vmx_ctxt_switch_from(struct vcpu *v)
{
    /*
     * Return early if trying to do a context switch without VMX enabled,
     * this can happen when the hypervisor shuts down with HVM guests
     * still running.
     */
    if ( unlikely(!this_cpu(vmxon)) )
        return;

    vmx_fpu_leave(v);
    vmx_save_guest_msrs(v);
    vmx_restore_host_msrs();
    vmx_save_dr(v);
}

The unlikely path is not taken and the two ud2 (I assume the ud2 are the 
ASSERTs in vmx_fpu_leave?) are not reached either:
0xffff82d0801fa077 <vmx_ctxt_switch_from+9>: lea 0x15c692(%rip),%rax        # 
0xffff82d080356710 <per_cpu__vmxon>
0xffff82d0801fa07e <vmx_ctxt_switch_from+16>:        mov    %rsp,%rdx
0xffff82d0801fa081 <vmx_ctxt_switch_from+19>:        and    
$0xffffffffffff8000,%rdx
0xffff82d0801fa088 <vmx_ctxt_switch_from+26>:        mov 0x7ff0(%rdx),%rdx
0xffff82d0801fa08f <vmx_ctxt_switch_from+33>:        cmpb   $0x0,(%rdx,%rax,1)
0xffff82d0801fa093 <vmx_ctxt_switch_from+37>:        je  0xffff82d0801fa1d9 
<vmx_ctxt_switch_from+363>
0xffff82d0801fa099 <vmx_ctxt_switch_from+43>:        cmpb   $0x0,0x109(%rdi)
0xffff82d0801fa0a0 <vmx_ctxt_switch_from+50>:        je  0xffff82d0801fa0a4 
<vmx_ctxt_switch_from+54>
0xffff82d0801fa0a2 <vmx_ctxt_switch_from+52>:        ud2
0xffff82d0801fa0a3 <vmx_ctxt_switch_from+53>:        or     (%rdi),%ecx
0xffff82d0801fa0a5 <vmx_ctxt_switch_from+55>:        and    %al,%al
0xffff82d0801fa0a7 <vmx_ctxt_switch_from+57>:        test   $0x8,%al
0xffff82d0801fa0a9 <vmx_ctxt_switch_from+59>:        jne 0xffff82d0801fa0ad 
<vmx_ctxt_switch_from+63>
0xffff82d0801fa0ab <vmx_ctxt_switch_from+61>:        ud2
0xffff82d0801fa0ac <vmx_ctxt_switch_from+62>:        or     -0x75(%rax),%ecx
0xffff82d0801fa0af <vmx_ctxt_switch_from+65>:        xchg   %eax,(%rax)
0xffff82d0801fa0b1 <vmx_ctxt_switch_from+67>:        (bad)
0xffff82d0801fa0b2 <vmx_ctxt_switch_from+68>:        add    %al,(%rax)
0xffff82d0801fa0b4 <vmx_ctxt_switch_from+70>:        test   $0x8,%al
0xffff82d0801fa0b6 <vmx_ctxt_switch_from+72>:        jne 0xffff82d0801fa0d1 
<vmx_ctxt_switch_from+99>
0xffff82d0801fa0b8 <vmx_ctxt_switch_from+74>:        or     $0x8,%rax
0xffff82d0801fa0bc <vmx_ctxt_switch_from+78>:        mov    %rax,0x700(%rdi)
0xffff82d0801fa0c3 <vmx_ctxt_switch_from+85>:        mov    $0x6c00,%edx
0xffff82d0801fa0c8 <vmx_ctxt_switch_from+90>:        vmwrite %rax,%rdx
0xffff82d0801fa0cb <vmx_ctxt_switch_from+93>:        jbe 0xffff82d0801fd23a

The two test, jne should be the if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) ) 
and if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) ) conditions.
mov    $0x6c00,%edx
vmwrite %rax,%rdx
Should be the
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);

So the real reason the hypervisor panics should be a failing __vmwrite in 
static void vmx_fpu_leave(struct vcpu *v)
which simply jumps to a location behind vmx_vmenter_helper, thereby creating a 
slightly confusing stack trace.

Chapter2:
This does not seem to have anything to do with the altp2m so I looked at the 
stray vmx_vcpu_update_eptp which can be seen in the bt, but not in the xen 
dmesg.
#10 [ffff830617fd7d28] destroy_perdomain_mapping at ffff82d080196152
#11 [ffff830617fd7d38] vmx_vcpu_update_eptp at ffff82d0801f7c6b
#12 [ffff830617fd7d78] free_compat_arg_xlat at ffff82d080244a62

This function is not called by free_compat_arg_xlat:
void free_compat_arg_xlat(struct vcpu *v)
{
    destroy_perdomain_mapping(v->domain, ARG_XLAT_START(v),
                              PFN_UP(COMPAT_ARG_XLAT_SIZE));
}

Instead it is called BEFORE free_compat_arg_xlat in 
hvm_vcpu_destroy->altp2m_vcpu_destroy->altp2m_vcpu_update_p2m->hvm_funcs.altp2m_vcpu_update_p2m
 (.altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,) :
void hvm_vcpu_destroy(struct vcpu *v)
{
    hvm_all_ioreq_servers_remove_vcpu(v->domain, v);

    if ( hvm_altp2m_supported() )
        altp2m_vcpu_destroy(v);

    nestedhvm_vcpu_destroy(v);

    free_compat_arg_xlat(v);
[...]

Here the enabling of the altp2m has an effect, but I have no idea how it could 
lead to a failing __vmwrite.
Any ideas where in the altp2m-code the error could be, or how I could help in 
finding it?

Cheers

Kevin

> -----Ursprüngliche Nachricht-----
> Von: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Gesendet: Dienstag, 2. August 2016 14:34
> An: Mayer, Kevin <Kevin.Mayer@xxxxxxxx>
> Cc: andrew.cooper3@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx
> Betreff: Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
> 
> >>> On 02.08.16 at 13:45, <Kevin.Mayer@xxxxxxxx> wrote:
> > (XEN) ----[ Xen-4.6.1  x86_64  debug=y  Not tainted ]----
> > (XEN) CPU:    6
> > (XEN) RIP:    e008:[<ffff82d0801fd23a>]
> vmx_vmenter_helper+0x27e/0x30a
> > (XEN) RFLAGS: 0000000000010003   CONTEXT: hypervisor
> > (XEN) rax: 000000008005003b   rbx: ffff8300e72fc000   rcx:
> 0000000000000000
> > (XEN) rdx: 0000000000006c00   rsi: ffff830617fd7fc0   rdi: ffff8300e6fc0000
> > (XEN) rbp: ffff830617fd7c40   rsp: ffff830617fd7c30   r8:  0000000000000000
> > (XEN) r9:  ffff830be8dc9310   r10: 0000000000000000   r11: 00003475e9cf85d0
> > (XEN) r12: 0000000000000006   r13: ffff830c14ee1000   r14: ffff8300e6fc0000
> > (XEN) r15: ffff830617fd0000   cr0: 000000008005003b   cr4:
> 00000000000026e0
> > (XEN) cr3: 00000001bd665000   cr2: 0000000004510000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) Xen stack trace from rsp=ffff830617fd7c30:
> > (XEN)    ffff830617fd7c40 ffff8300e72fc000 ffff830617fd7ca0
> ffff82d080174f91
> > (XEN)    ffff830617fd7f18 ffff830be8dc9000 0000000000000286
> ffff830617fd7c90
> > (XEN)    0000000000000206 0000000000000246 0000000000000001
> ffff830617e91250
> > (XEN)    ffff8300e72fc000 ffff830be8dc9000 ffff830617fd7cc0
> ffff82d080178c19
> > (XEN)    0000000000bdeeae ffff8300e72fc000 ffff830617fd7cd0
> ffff82d080178c3e
> > (XEN)    ffff830617fd7d20 ffff82d080179740 ffff8300e6fc2000
> ffff830c17e38e80
> > (XEN)    ffff830617e91250 ffff820080000000 0000000000000002
> ffff830617e91250
> > (XEN)    ffff830617e91240 ffff830be8dc9000 ffff830617fd7d70
> ffff82d080196152
> > (XEN)    ffff830617fd7d50 ffff82d0801f7c6b ffff8300e6fc2000
> ffff830617e91250
> > (XEN)    ffff8300e6fc2000 ffff830617e91250 ffff830617e91240
> ffff830be8dc9000
> > (XEN)    ffff830617fd7d80 ffff82d080244a62 ffff830617fd7db0
> ffff82d0801d3fe2
> > (XEN)    ffff8300e6fc2000 0000000000000000 ffff830617e91f28
> ffff830617e91000
> > (XEN)    ffff830617fd7dd0 ffff82d080175c2c ffff8300e6fc2000
> ffff8300e6fc2000
> > (XEN)    ffff830617fd7e00 ffff82d080105dd4 ffff830c17e38040
> 0000000000000000
> > (XEN)    0000000000000000 ffff830617fd0000 ffff830617fd7e30
> ffff82d0801215fd
> > (XEN)    ffff8300e6fc0000 ffff82d080329280 ffff82d080328f80 fffffffffffffffd
> > (XEN)    ffff830617fd7e60 ffff82d08012caf8 0000000000000006
> ffff830c17e3bc60
> > (XEN)    0000000000000002 ffff830c17e3bbe0 ffff830617fd7e70
> ffff82d08012cb3b
> > (XEN)    ffff830617fd7ef0 ffff82d0801c23a8 ffff8300e72fc000 ffffffffffffffff
> > (XEN)    ffff82d0801f3200 ffff830617fd7f08 ffff82d080329280
> 0000000000000000
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82d0801fd23a>] vmx_vmenter_helper+0x27e/0x30a
> > (XEN)    [<ffff82d080174f91>] __context_switch+0xdb/0x3b5
> > (XEN)    [<ffff82d080178c19>] __sync_local_execstate+0x5e/0x7a
> > (XEN)    [<ffff82d080178c3e>] sync_local_execstate+0x9/0xb
> > (XEN)    [<ffff82d080179740>] map_domain_page+0xa0/0x5d4
> > (XEN)    [<ffff82d080196152>] destroy_perdomain_mapping+0x8f/0x1e8
> > (XEN)    [<ffff82d080244a62>] free_compat_arg_xlat+0x26/0x28
> > (XEN)    [<ffff82d0801d3fe2>] hvm_vcpu_destroy+0x73/0xb0
> > (XEN)    [<ffff82d080175c2c>] vcpu_destroy+0x5d/0x72
> > (XEN)    [<ffff82d080105dd4>] complete_domain_destroy+0x49/0x192
> > (XEN)    [<ffff82d0801215fd>] rcu_process_callbacks+0x19a/0x1fb
> > (XEN)    [<ffff82d08012caf8>] __do_softirq+0x82/0x8d
> > (XEN)    [<ffff82d08012cb3b>] process_pending_softirqs+0x38/0x3a
> > (XEN)    [<ffff82d0801c23a8>] mwait_idle+0x10c/0x315
> > (XEN)    [<ffff82d080174825>] idle_loop+0x51/0x6b
> 
> On this deep a stack execution can't validly end up in
> vmx_vmenter_helper: That's a function called only when the stack is almost
> empty. Nor is the caller of it the context switch code.
> Hence your problem starts quite a bit earlier - perhaps memory corruption?
> 
> Jan
____________
Virus checked by G Data MailSecurity
Version: AVA 25.7708 dated 03.08.2016
Virus news: www.antiviruslab.com

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.