On Nov 5, 2013, at 2:29 PM, "Mallick, Asit K" <asit.k.mallick@xxxxxxxxx> wrote:
> Jeff,
> Could you check if you have the latest microcode updates installed
> on this system? Or, could you send me the microcode rev and I can check.
>
> Thanks,
> Asit
>
>
> From: "Jeff_Zimmerman@xxxxxxxxxx" <Jeff_Zimmerman@xxxxxxxxxx>
> Date: Tuesday, November 5, 2013 2:55 PM
> To: "lars.kurth@xxxxxxx" <lars.kurth@xxxxxxx>
> Cc: "lars.kurth.xen@xxxxxxxxx" <lars.kurth.xen@xxxxxxxxx>,
>     "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>,
>     "JBeulich@xxxxxxxx" <JBeulich@xxxxxxxx>
> Subject: Re: [Xen-devel] Intermittent fatal page fault with XEN 4.3.1
>     (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>
> Lars,
> I understand the mailing list limits attachment size to 512K. Where
> can I post the xen binary and symbols file?
> Jeff
>
> On Nov 5, 2013, at 7:46 AM, Lars Kurth <lars.kurth@xxxxxxx> wrote:
>
> Jan, Andrew, Ian,
>
> Pulling in Jeff, who raised the question. Snippets from misc replies
> are attached. Jeff, please look through these (in particular Jan's
> answer) and answer any further questions on this thread.
>
> On 05/11/2013 09:53, Ian Campbell wrote:
>> TBH I think for this kind of thing (i.e. a bug
not a user question) the most appropriate thing to
>> do would be to redirect them to xen-devel
themselves (with a reminder that they do not need
>> to subscribe to post).
> Agreed. Another option is for me to start the thread and pull the
> original reporter into it, if it is a bug. I was not sure this was a
> real bug at first, but it seems it is.
>
> On 04/11/2013 20:00, Andrew Cooper wrote:
>> Which version of Xen were these images saved on?
> [Jeff] We were careful to regenerate all the images after upgrading
> to 4.3.1. We also saw the same problem on 4.3.0.
>
>> Are you expecting to be using nested-virt? (It is
still very definitely experimental)
> [Jeff] Not using nested-virt.
>
> On 05/11/2013 10:04, Jan Beulich wrote:
>
> On 04.11.13 at 20:54, Lars Kurth <lars.kurth.xen@xxxxxxxxx> wrote:
>
>
> See
> http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
> ---
> I have a 32-core system running XEN 4.3.1 with 30 Windows XP VMs.
> DOM0 is Centos 6.3 based with linux kernel 3.10.16.
> In my configuration all of the Windows HVMs are running, having been
> restored from xl save.
> VMs are destroyed or restored in an on-demand fashion. After some
> time XEN will experience a fatal page fault while restoring one of
> the Windows HVM subjects. This does not happen very often, perhaps
> once in a 16 to 48 hour period.
> The stack trace from xen follows. Thanks in advance for any help.
>
> (XEN) ----[ Xen-4.3.1 x86_64 debug=n Tainted: C ]----
> (XEN) CPU: 52
> (XEN) RIP: e008:[] domain_page_map_to_mfn+0x86/0xc0
>
>
> Zapping addresses (here and below in the stack trace) is never
> helpful when someone asks for help with a crash. Also, in order to
> not just guess, the matching xen-syms or xen.efi should be made
> available or pointed to.
>
>
>
> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> (XEN) rax: 000ffffffffff000   rbx: ffff8300bb163760   rcx: 0000000000000000
> (XEN) rdx: ffff810000000000   rsi: 0000000000000000   rdi: 0000000000000000
> (XEN) rbp: ffff8300bb163000   rsp: ffff8310333e7cd8   r8:  0000000000000000
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: ffff8310333e7f18   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000000426f0
> (XEN) cr3: 000000211bee5000   cr2: ffff810000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen stack trace from rsp=ffff8310333e7cd8:
> (XEN)  0000000000000001 ffff82c4c01de869 ffff82c4c0182c70 ffff8300bb163000
> (XEN)  0000000000000014 ffff8310333e7f18 0000000000000000 ffff82c4c01d7548
> (XEN)  ffff8300bb163490 ffff8300bb163000 ffff82c4c01c65b8 ffff8310333e7e60
> (XEN)  ffff82c4c01badef ffff8300bb163000 0000000000000003 ffff833144d8e000
> (XEN)  ffff82c4c01b4885 ffff8300bb163000 ffff8300bb163000 ffff8300bdff1000
> (XEN)  0000000000000001 ffff82c4c02f2880 ffff82c4c02f2880 ffff82c4c0308440
> (XEN)  ffff82c4c01d0ea8 ffff8300bb163000 ffff82c4c015ad6c ffff82c4c02f2880
> (XEN)  ffff82c4c02cf800 00000000ffffffff ffff8310333f5060 ffff82c4c02f2880
> (XEN)  0000000000000282 0010000000000000 0000000000000000 0000000000000000
> (XEN)  0000000000000000 ffff82c4c02f2880 ffff8300bdff1000 ffff8300bb163000
> (XEN)  000031a10f2b16ca 0000000000000001 ffff82c4c02f2880 ffff82c4c0308440
> (XEN)  ffff82c4c0124444 0000000000000034 ffff8310333f5060 0000000001c9c380
> (XEN)  00000000c0155965 ffff82c4c01c6146 0000000001c9c380 ffffffffffffff00
> (XEN)  ffff82c4c0128fa8 ffff8300bb163000 ffff8327d50e9000 ffff82c4c01bc490
> (XEN)  0000000000000000 ffff82c4c01dd254 0000000080549ae0 ffff82c4c01cfc3c
> (XEN)  ffff8300bb163000 ffff82c4c01d6128 ffff82c4c0125db9 ffff82c4c0125db9
> (XEN)  ffff8310333e0000 ffff8300bb163000 000000000012ffc0 0000000000000000
> (XEN)  0000000000000000 0000000000000000 0000000000000000 ffff82c4c01deaa3
> (XEN)  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)  000000000012ffc0 000000007ffdf000 0000000000000000 0000000000000000
> (XEN) Xen call trace:
> (XEN) [] domain_page_map_to_mfn+0x86/0xc0
> (XEN) [] nvmx_handle_vmlaunch+0x49/0x160
> (XEN) [] __update_vcpu_system_time+0x240/0x310
> (XEN) [] vmx_vmexit_handler+0xb58/0x18c0
> (XEN) [] pt_restore_timer+0xa8/0xc0
> (XEN) [] hvm_io_assist+0xef/0x120
> (XEN) [] hvm_do_resume+0x195/0x1c0
> (XEN) [] vmx_do_resume+0x148/0x210
> (XEN) [] context_switch+0x1bc/0xfc0
> (XEN) [] schedule+0x254/0x5f0
> (XEN) [] pt_update_irq+0x256/0x2b0
> (XEN) [] timer_softirq_action+0x168/0x210
> (XEN) [] hvm_vcpu_has_pending_irq+0x50/0xb0
> (XEN) [] nvmx_switch_guest+0x54/0x1560
> (XEN) [] vmx_intr_assist+0x6c/0x490
> (XEN) [] vmx_vmenter_helper+0x88/0x160
> (XEN) [] __do_softirq+0x69/0xa0
> (XEN) [] __do_softirq+0x69/0xa0
> (XEN) [] vmx_asm_do_vmentry+0/0xed
> (XEN)
> (XEN) Pagetable walk from ffff810000000000:
> (XEN) L4[0x102] = 000000211bee5063 ffffffffffffffff
> (XEN) L3[0x000] = 0000000000000000 ffffffffffffffff
>
>
> This makes me suspect that domain_page_map_to_mfn() gets a NULL
> pointer passed here. As said above, this is only guesswork at this
> point, and as Ian already pointed out, directing the reporter to
> xen-devel would seem to be the right thing to do here anyway.
>
> Jan
>
>
>