
[Xen-devel] Xen-4.3 - curious crash



Hello,

Last night, XenRT discovered an interesting host crash.  The crash
itself is somewhat concerning, but the lack of information also
highlights an area which could do with easier debuggability.

Here are the results from the serial console.  The server in question is
a Supermicro Xeon X5376 system which has not exhibited stability issues
in the past, and has seemed fine in tests today.

I have linearised the stack and annotated each word.

----[ Xen-4.3.1-xs82408-d  x86_64  debug=y  Not tainted ]----
CPU:    4
RIP:    e008:[<ffff82c4c0235a92>] compat_create_bounce_frame+0x8/0xec
RFLAGS: 0000000000010046   CONTEXT: hypervisor
rax: 0000000000000061   rbx: ffff8300cfafa000   rcx: ffff82c4c02ffd80
rdx: ffff8300cfafa570   rsi: ffff83022eacfd00   rdi: ffff8300cfafa000
rbp: ffff83022eacfd60   rsp: ffff83022eacff08   r8:  0000000000000000
r9:  0000000000000000   r10: ffff83022ead32e8   r11: 00001ac42042804f
r12: ffff8300cfafa000   r13: 0000000000000004   r14: ffff8300cfd3f000
r15: 0000000000000001   cr0: 000000008005003b   cr4: 00000000000026f0
cr3: 0000000228dde000   cr2: 00000000b74e4f10
ds: 007b   es: 007b   fs: 00d8   gs: 00e0   ss: 0000   cs: e008
Xen stack trace from rsp=ffff83022eacff08:
    0000000000000093 | rflags from pushfq in ASSERT_INTERRUPTS_ENABLED
    ffff82c4c02358d8 | RA? compat/entry.S:123 in compat_test_all_events()
    0000000000000001 | r15
    ffff8300cfd3f000 | r14
    0000000000000004 | r13
    ffff8300cfafa000 | r12
    00000000c1695ec0 | ebp
    00000000deadbeef | ebx
    0000000000000000 | r11
    00000000deadbeef | r10
    ffff8300cfafa060 | r9
    0000000000000000 | r8
    0000000000000000 | eax
    00000000deadbeef | ecx
    00000000ee8507a0 | edx
    00000000c23a7000 | esi
    0000000000000000 | edi
    0002010000000000 | TRAP_syscall | TRAP_regs_dirty
    00000000c10013a7 + (hypercall page) __HYPERCALL_sched_op
    0000000000000061 |
    0000000000000246 | Exception frame from ring1 kernel
    00000000c1695eb0 |
    0000000000000069 +
    0000000000000000 | es
    0000000000000000 | ds
    0000000000000000 | fs
    0000000000000000 | gs
    0000000000000004 | cpu_info.processor_id
    ffff8300cfafa000 | cpu_info.current_vcpu
    0000003d6e797180 | cpu_info.per_cpu_offset
    0000000000000000 +

Xen call trace:
   [<ffff82c4c0235a92>] compat_create_bounce_frame+0x8/0xec
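
As a quick cross-check of the annotated dump, the number of words printed
can be compared against the distance from rsp to the top of the stack.
This is only a sketch: it assumes (the trace does not say so) that the
stack ends at the next 4 KiB page boundary above rsp, and the helper name
is mine, not anything in Xen.

```python
# Sanity check on the length of the stack dump.
# ASSUMPTION: the stack top is the next 4 KiB page boundary above rsp.
RSP = 0xffff83022eacff08   # from "Xen stack trace from rsp=..."
PAGE_SIZE = 0x1000         # assumed alignment of the stack top
WORD = 8                   # bytes per 64-bit stack slot


def words_to_stack_top(rsp, page_size=PAGE_SIZE):
    """Return (stack_top, number of 8-byte words from rsp up to stack_top)."""
    top = (rsp + page_size) & ~(page_size - 1)
    return top, (top - rsp) // WORD


top, nwords = words_to_stack_top(RSP)
print(hex(top), nwords)
```

Under that assumption the result is 31 words, which matches the 31 lines
in the dump above, with the cpu_info fields occupying the last four.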


Xen has failed the ASSERT_INTERRUPTS_ENABLED check at the very top of
compat_create_bounce_frame.  The check itself lacks a bugframe, which is
why the crash is not automatically recognised as an assertion failure.
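
For reference, the three flag values visible in the dump can be decoded
with a few lines of Python.  The bit positions are the architectural x86
ones; the labels, and my reading of 0x246 as the saved guest eflags in the
ring1 exception frame, are assumptions on my part.

```python
# Architectural x86 RFLAGS bit positions (bit 1 is reserved, always 1).
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    8: "TF", 9: "IF", 10: "DF", 11: "OF", 16: "RF",
}
X86_EFLAGS_IF = 1 << 9


def decode_rflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(FLAG_BITS.items())
            if value & (1 << bit)]


def interrupts_enabled(value):
    """True if the interrupt flag (bit 9) is set."""
    return bool(value & X86_EFLAGS_IF)


samples = (
    ("word pushed by pushfq", 0x93),
    ("RFLAGS in register dump", 0x10046),
    ("guest eflags (assumed)", 0x246),
)
for label, value in samples:
    print(f"{label:26s} {value:#08x} {decode_rflags(value)} "
          f"IF={'on' if interrupts_enabled(value) else 'off'}")
```

Both hypervisor values have IF clear, which is what trips the assertion;
only the value I take to be the guest's eflags has it set.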

Following the code back using what I presume to be a return address in
the second word on the stack, the codeflow looks like:

compat_test_all_events:
  ...
  sti
  leaq ...
  5x mov ...
  call compat_create_bounce_frame
  jmp  compat_test_all_events

compat_create_bounce_frame:
  pushfq
  testb
  jnz
  ud2


What I presume has happened is that after 'sti', Xen has taken an
interrupt, which has caused some form of corruption.  Judging from the
top word on the stack, rflags looks quite corrupt.  Unfortunately, this
is all the available information.  (The crash kernel failed to boot,
which is another issue I am looking into.)

For crashes like this, particularly when attempting to leave Xen context
and return to a guest, the information provided by the stack trace is
quite lacking.  The interesting information is what has just been popped
off the stack (which I am hoping would have been another exception
frame).

Would it be sensible to have some indication that we are on the way out
of Xen, so that errors in situations like this get a chance to print
some of the recently popped stack values?  I know it won't be a heavily
used debugging aid, but I think it is probably worth the effort for
situations like this, where there is simply not enough information to
diagnose the issue.
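
To make the idea concrete, here is a toy model in Python of what "print
the recently popped values" would mean: walk a few words below the
current rsp.  Everything here is hypothetical, including the function
name, the dictionary standing in for memory, and the made-up values; it
is not a proposal for the actual Xen code.

```python
WORD = 8  # bytes per 64-bit stack slot


def dump_popped(memory, rsp, count=4):
    """Return (address, value) pairs for up to `count` words just below
    rsp, nearest first.  `memory` maps word-aligned addresses to values."""
    popped = []
    for i in range(1, count + 1):
        addr = rsp - i * WORD
        if addr in memory:          # skip words the model knows nothing about
            popped.append((addr, memory[addr]))
    return popped


# Made-up stack contents purely for illustration.
rsp = 0x1000
memory = {
    0x0ff8: 0x1111,   # most recently popped word
    0x0ff0: 0x2222,
    0x0fe8: 0x3333,
    0x1000: 0x4444,   # live top of stack, not part of the "popped" dump
}
for addr, value in dump_popped(memory, rsp, count=3):
    print(f"{addr:#06x}: {value:#06x}")
```

Anything below rsp may already have been overwritten by a later interrupt
or exception frame, so output like this could only ever be a hint.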

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

