
[Xen-devel] hypervisor crash look familiar?




We're seeing a hypervisor crash that is very difficult to
reproduce. It happens on 3.0.4-based bits running 32-bit PAE.
I'm wondering if this looks familiar to anyone, and hoping
it's been fixed in later releases. It seems to happen
randomly.

Here's the crash information from one of the occurrences:


(XEN) extable.c:77: Pre-exception: ff140fcc -> 00000000
(XEN) ----[ Xen-3.0.4-1-sun  x86_32p  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) EIP:    e008:[<ff140fcc>] __spurious_page_fault+0x95/0x407
(XEN) EFLAGS: 00010006   CONTEXT: hypervisor
(XEN) eax: 00000000   ebx: 0004b386   ecx: 00000003   edx: 00000000
(XEN) esi: 4b386000   edi: fedab000   ebp: ff1dbdbc   esp: ff1dbd24
(XEN) cr0: 8005003b   cr4: 000006f0   cr3: 4b386000   cr2: fedab018
(XEN) ds: e010   es: e010   fs: 0000   gs: 01b1   ss: e010   cs: e008
(XEN) Xen stack trace from esp=ff1dbd24:
(XEN)    0004b386 ff1dbf20 000001ad ff133317 f6ea8078 cd400228 00000002
(XEN)    00000000 00000001 fedab000 00000004 ff12fa06 fedad000 ff228100
(XEN)    00000000 00000001 ff228100 f6edc020 ff1dbdcc ff12f300 f6edc020
(XEN)    f6edc030 f6edc030 00000003 80000000 f6edc020 00000000 00000003
(XEN)    f6edc020 000000fd 01228100 80000001 ff228100 00048b62 ff228500
(XEN)    00000282 feda7fcc feda7fcc ff1dbddc ff14135a feda7fcc ff1dbe08
(XEN)    ff228100 ff1dbe08 feda7fcc ff1dbe08 ff1dbdfc ff14171a feda7fcc
(XEN)    ff1dbe08 ff1dbdfc ff13c675 ff234100 ff1dffcc 00e241e3 ff17dd02
(XEN)    ff1dbe08 00000000 0000000d 000002a0 ff1dffcc feda7fcc ff1dbe6c
(XEN)    00000000 000e0002 ff12fbd1 0000e008 00010212 00047005 00000000
(XEN)    ff1dbe6c ff12f8bd f6ed1130 fde69c50 80000000 00000000 feda7000
(XEN)    f6ea8078 fedab000 ff1dbfb4 00000000 00000000 ff1dbebc ff1325c7
(XEN)    fedab000 00000002 ff1dbe9c ff13c74e 00000002 000000fd ff228100
(XEN)    ff1438cc cd75ea7c 00000000 ff1dbfb4 ff228100 48b03001 00000000
(XEN)    f6f0d490 78000002 f6f0d490 0004b386 ff1dbf8c ff135204 fedab000
(XEN)    00000000 00000000 0004b386 48000001 ff1dbf58 ff13a0d3 ff1dbf20
(XEN)    ff1a4a28 80000002 ff1dbf8c ff228100 f73f94d8 a0000000 ff1dbf18
(XEN)    ff1dbfb4 ff228100 ff1b5540 ff1b5540 ff1b5540 f6ed0850 ff1dbfb4
(XEN)    ff228100 ff234100 00000000 00000000 00000000 fedab000 ff228100
(XEN)    00000000 00000000 ff234100 00000000 00000000 00000000 00000001
(XEN) Xen call trace:
(XEN)    [<ff140fcc>] __spurious_page_fault+0x95/0x407
(XEN)    [<ff14135a>] spurious_page_fault+0x1c/0x24
(XEN)    [<ff14171a>] do_page_fault+0x1b5/0x21d
(XEN)    [<ff17dd02>] handle_exception+0x62/0x92
(XEN)    [<ff12fbd1>] create_pae_xen_mappings+0x142/0x290
(XEN)    [<ff1325c7>] mod_l3_entry+0x813/0x878
(XEN)    [<ff135204>] do_mmu_update+0x582/0xbe3
(XEN)    [<ff17d9f5>] hypercall+0x95/0xb5
(XEN)
(XEN) Faulting linear address: fedab018
(XEN) Pagetable walk from fedab018:
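
The pagetable walk output is cut off above, so as a rough aid here is a minimal sketch of how a 32-bit PAE linear address splits into table indices (2 bits for L3, 9 for L2, 9 for L1, 12 for the page offset). The struct and function names are hypothetical and not taken from the Xen source; this only does the arithmetic and says nothing about what the entries actually contained at crash time.

```c
#include <stdint.h>

/* Index fields of a 32-bit PAE linear address (2/9/9/12 bit split).
 * Hypothetical helper type, not part of Xen. */
struct pae_indices {
    unsigned l3;      /* bits 31:30, index into the 4-entry L3 (PDPT) */
    unsigned l2;      /* bits 29:21, index into a 512-entry L2 table  */
    unsigned l1;      /* bits 20:12, index into a 512-entry L1 table  */
    unsigned offset;  /* bits 11:0,  byte offset within the 4K page   */
};

/* Split a linear address into its PAE page-table indices. */
static struct pae_indices pae_split(uint32_t va)
{
    struct pae_indices ix;
    ix.l3     = (va >> 30) & 0x3;
    ix.l2     = (va >> 21) & 0x1ff;
    ix.l1     = (va >> 12) & 0x1ff;
    ix.offset = va & 0xfff;
    return ix;
}
```

Calling pae_split(0xfedab018) gives L3 index 3, L2 index 502, L1 index 427 and page offset 0x18. If the page at fedab000 is being treated as a table of 8-byte entries, offset 0x18 would be entry index 3, but that interpretation is an assumption.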

From examining the code, the
create_pae_xen_mappings()+0x142 location corresponds to the
first store instruction of the l2e_write() call
in the first loop in this bit of create_pae_xen_mappings():

    /* Xen private mappings. */
    pl2e = map_domain_page(l3e_get_pfn(l3e3));
    memcpy(&pl2e[L2_PAGETABLE_FIRST_XEN_SLOT & (L2_PAGETABLE_ENTRIES-1)],
           &idle_pg_table_l2[L2_PAGETABLE_FIRST_XEN_SLOT],
           L2_PAGETABLE_XEN_SLOTS * sizeof(l2_pgentry_t));
    for ( i = 0; i < PDPT_L2_ENTRIES; i++ )
    {
        l2e = l2e_from_page(
            virt_to_page(page_get_owner(page)->arch.mm_perdomain_pt) + i,
            __PAGE_HYPERVISOR);
        l2e_write(&pl2e[l2_table_offset(PERDOMAIN_VIRT_START) + i], l2e);
    }
    for ( i = 0; i < (LINEARPT_MBYTES >> (L2_PAGETABLE_SHIFT - 20)); i++ )
    {
        l2e = l2e_empty();
        if ( l3e_get_flags(pl3e[i]) & _PAGE_PRESENT )
            l2e = l2e_from_pfn(l3e_get_pfn(pl3e[i]), __PAGE_HYPERVISOR);
        l2e_write(&pl2e[l2_table_offset(LINEAR_PT_VIRT_START) + i], l2e);
    }
    unmap_domain_page(pl2e);

That l2e_write() essentially expands to:

    u32 *__ptep_words = (u32 *)(ptep);                        \
    __ptep_words[0] = 0;                                      \
    wmb();                                                    \
    __ptep_words[1] = (pte) >> 32;                            \
    wmb();                                                    \
    __ptep_words[0] = (pte) >>  0;                            \

and it's the mov instruction for __ptep_words[0] = 0;
that page faults.
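
For anyone who wants to poke at the store ordering outside the hypervisor, here is a minimal standalone sketch of the same three-step write. The names sketch_wmb() and sketch_pae_pte_write() are made up for illustration, and the barrier is only a GCC/Clang compiler barrier standing in for Xen's wmb(), so this is an approximation of the macro above rather than the real thing.

```c
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax); an assumed stand-in for
 * the hypervisor's wmb(), which is not reproduced here. */
#define sketch_wmb() __asm__ __volatile__("" ::: "memory")

/* Write a 64-bit PAE entry as two 32-bit halves. The low word holds
 * the Present bit (bit 0), so it is cleared first and written last:
 * the entry is never seen as present with a half-updated value. */
static void sketch_pae_pte_write(volatile uint32_t *words, uint64_t pte)
{
    words[0] = 0;                      /* 1. mark entry not present  */
    sketch_wmb();
    words[1] = (uint32_t)(pte >> 32);  /* 2. install new high word   */
    sketch_wmb();
    words[0] = (uint32_t)pte;          /* 3. install new low word    */
}
```

Step 1 is the first access to the destination in this sequence, which is the store being described above as the one that faults.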



