Re: [Xen-devel] Xen-4.3 - curious crash



>>> On 28.01.14 at 21:25, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>     0000000000000093 | rflags from pushfq in ASSERT_INTERRUPTS_ENABLED
>     ffff82c4c02358d8 | RA? compat/entry.S:123 in compat_test_all_events()
>     0000000000000001 | r15
>     ffff8300cfd3f000 | r14
>     0000000000000004 | r13
>     ffff8300cfafa000 | r12
>     00000000c1695ec0 | ebp
>     00000000deadbeef | ebx
>     0000000000000000 | r11
>     00000000deadbeef | r10
>     ffff8300cfafa060 | r9
>     0000000000000000 | r8
>     0000000000000000 | eax
>     00000000deadbeef | ecx
>     00000000ee8507a0 | edx
>     00000000c23a7000 | esi
>     0000000000000000 | edi
>     0002010000000000 | TRAP_syscall | TRAP_regs_dirty
>     00000000c10013a7 + (hypercall page) __HYPERCALL_sched_op
>     0000000000000061 |
>     0000000000000246 | Exception frame from ring1 kernel
>     00000000c1695eb0 |
>     0000000000000069 +
>     0000000000000000 | es
>     0000000000000000 | ds
>     0000000000000000 | fs
>     0000000000000000 | gs
>     0000000000000004 | cpu_info.processor_id
>     ffff8300cfafa000 | cpu_info.current_vcpu
>     0000003d6e797180 | cpu_info.per_cpu_offset
>     0000000000000000 +
> 
> Xen call trace:
>    [<ffff82c4c0235a92>] compat_create_bounce_frame+0x8/0xec
> 
> 
> Xen has failed the ASSERT_INTERRUPTS_ENABLED check at the very top of
> compat_create_bounce_frame, which itself lacks a bugframe which is why
> it is not automatically recognised as an assertion.
> 
> Following the code back using what I presume to be a return address as
> the penultimate word on the stack, the codeflow looks like:
> 
> compat_test_all_events:
>   ...
>   sti
>   leaq ...
>   5x mov ...
>   call compat_create_bounce_frame
>   jmp  compat_test_all_events
> 
> compat_create_bounce_frame:
>   pushfq
>   testb
>   jnz
>   ud2
> 
> 
> What I presume has happened is that after 'sti', Xen has taken an
> interrupt, which has caused some form of corruption.  Judging from the
> top word on the stack, rflags looks quite corrupt.

Other than IF being clear, I see no obvious corruption: CF, AF, and SF
(and the reserved bit 1) are set, and all other flags are clear. That
seems a quite reasonable state after the "cmpl  $0xfe,%eax" (the most
recent instruction that affected the flags).
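
To make the flag reading above easy to check, here is a minimal C sketch
that decodes an rflags value and models the flags a 32-bit "cmp" leaves
behind. The helper names (rflags_describe, cmp32_flags) are hypothetical
and not part of Xen. The %eax value of 0 used in the note after the code
is an assumption for illustration only; the actual value at the time of
the compare is not known from the dump.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 flag bit positions. */
#define FL_CF (1u << 0)
#define FL_R1 (1u << 1)   /* reserved, always reads as 1 */
#define FL_PF (1u << 2)
#define FL_AF (1u << 4)
#define FL_ZF (1u << 6)
#define FL_SF (1u << 7)
#define FL_IF (1u << 9)
#define FL_OF (1u << 11)

/* Hypothetical helper: write the names of the set flags into buf. */
static const char *rflags_describe(uint64_t rflags, char *buf, size_t len)
{
    static const struct { uint32_t mask; const char *name; } tbl[] = {
        { FL_CF, "CF" }, { FL_PF, "PF" }, { FL_AF, "AF" }, { FL_ZF, "ZF" },
        { FL_SF, "SF" }, { FL_IF, "IF" }, { FL_OF, "OF" },
    };
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if (!(rflags & tbl[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", tbl[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* out of room: drop the partial name */
            break;
        }
        used += (size_t)n;
    }
    return buf;
}

/* Hypothetical model of the status flags after "cmpl src, dst" (dst - src). */
static uint32_t cmp32_flags(uint32_t dst, uint32_t src)
{
    uint32_t res = dst - src;
    uint32_t flags = FL_R1;
    uint8_t low = (uint8_t)res;

    /* Fold the low byte so that bit 0 holds its parity. */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;

    if (dst < src)                                flags |= FL_CF; /* borrow */
    if (!(low & 1))                               flags |= FL_PF; /* even parity */
    if ((dst ^ src ^ res) & 0x10)                 flags |= FL_AF; /* nibble borrow */
    if (res == 0)                                 flags |= FL_ZF;
    if (res & 0x80000000u)                        flags |= FL_SF;
    if ((dst ^ src) & (dst ^ res) & 0x80000000u)  flags |= FL_OF;
    return flags;
}
```

With these definitions, rflags_describe(0x93, ...) yields "CF AF SF" and
the guest frame's 0x246 yields "PF ZF IF". Under the assumed %eax of 0,
cmp32_flags(0, 0xfe) evaluates to 0x93, i.e. the value on the stack with
IF (bit 9) clear.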

An interrupt not properly restoring EFLAGS.IF (or actually one not
properly restoring all of EFLAGS) would be very odd. About as odd
as a cosmic radiation induced bit flip resulting in some other
misbehavior. This hasn't been seen more than once, I suppose?

> For crashes like this, particularly when attempting to leave Xen context
> and return back to a guest, the information provided by the stack trace
> is quite lacking. The interesting information is what has just
> been popped off the stack (which I am hoping would have been another
> exception frame).
> 
> Would it be sensible to have some indication that we are on the way out
> of Xen, so errors in situations like this can take a chance to print
> some of the recently popped stack values? I know it won't be a
> heavily used debugging aid, but I think it is probably worth the effort
> for situations like this where there is simply not enough information to
> diagnose the issue.

While I realize that in a case like this seeing stack contents below the
stack pointer may be useful (but there's no guarantee it would be), I
don't think it is reasonable to prepare the code so that all kinds of
extremely unlikely scenarios become debuggable. If the issue here is
reproducible, I'm sure you'll be able to instrument the code such that
you can get further information out of the system (and that's not
necessarily just stack contents - presumably you'd want to track
other state or state changes in some kind of static buffer, which
you'd then also want to dump out at the point of the crash).
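
As a rough illustration of the static buffer idea, here is a minimal C
sketch of a fixed-size ring that records recent events and dumps them
oldest first. All names (trace_record, trace_dump, TRACE_SLOTS) and the
record layout are hypothetical. It is plain user-space C: real
instrumentation inside Xen would presumably use printk and a per-CPU
buffer, and would have to be safe against interrupts, none of which is
modelled here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TRACE_SLOTS 16u   /* power of two, so indexing survives counter wrap */

struct trace_rec {
    uint32_t event;       /* hypothetical event identifier */
    uint64_t value;       /* state captured with the event, e.g. rflags */
};

static struct trace_rec trace_buf[TRACE_SLOTS];
static unsigned int trace_head;   /* total number of records written */

/* Record one event, overwriting the oldest slot once the ring is full. */
static void trace_record(uint32_t event, uint64_t value)
{
    struct trace_rec *r = &trace_buf[trace_head % TRACE_SLOTS];

    r->event = event;
    r->value = value;
    trace_head++;
}

/* Print the retained records, oldest first; returns how many were printed. */
static unsigned int trace_dump(FILE *out)
{
    unsigned int n = trace_head < TRACE_SLOTS ? trace_head : TRACE_SLOTS;
    unsigned int start = trace_head - n;

    assert(out != NULL);
    for (unsigned int i = 0; i < n; i++) {
        const struct trace_rec *r = &trace_buf[(start + i) % TRACE_SLOTS];

        fprintf(out, "trace[%u]: event=%" PRIu32 " value=%016" PRIx64 "\n",
                start + i, r->event, r->value);
    }
    return n;
}
```

Calling trace_record() at the points of interest and trace_dump() from
the crash path keeps the cost on the normal path to a couple of stores,
while still preserving the last TRACE_SLOTS state changes.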

Jan

