
Re: [Xen-devel] Kernel panic with tboot E820_UNUSABLE region



On 14/05/13 14:53, David Vrabel wrote:
> On 14/05/13 12:06, Aurelien Chartier wrote:
>> Hi everybody,
>>
>> We noticed a crash in the Linux dom0 early boot sequence when running over
>> tboot and Xen. The issue seemed related to an E820 region that tboot is
>> setting as E820_UNUSABLE. We posted to tboot-devel to better understand
>> what could be the cause of the kernel panic. The thread can be read
>> here:
>> http://sourceforge.net/mailarchive/forum.php?thread_name=51852B26.7070406%40citrix.com&forum_name=tboot-devel
>>
>> Following Konrad's advice, we took a closer look at arch/x86/xen/setup.c
>> and found what could be the cause of the kernel panic. I am not familiar
>> with that part of Xen, so feel free to correct me.
>>
>> The Xen memory setup code called during early boot tries to release
>> chunks of memory in xen_set_identity_and_release for non-RAM regions
>> (including E820_UNUSABLE). The xen_set_identity_and_release_chunk
>> function calls HYPERVISOR_update_va_mapping, which fails in our
>> case. As tboot marked that region as unusable, Xen did not map
>> those pages and the later call to get_page_from_l1e (arch/x86/mm.c in
>> Xen code) returns an error.  As the return value of the hypercall
>> is not checked in the Linux code, xen_set_identity_and_release_chunk
>> carries on and tries to release the E820_UNUSABLE chunk.
>> This apparently messes up some Xen internal memory structures,
>> resulting in a kernel crash when Linux initializes its memory mapping.
> That does not sound quite right to me.  xen_set_identity_and_release()
> is releasing RAM pfns that overlap with holes in the machine memory map
> and get_page_from_l1e() should always succeed.  The fact that they're
> overlapping with something marked as UNUSABLE shouldn't matter since it's
> no different from any of the other holes.
get_page_from_l1e() is failing as the call to
page_get_owner_and_reference() is returning NULL. This was caused by
page->count_info being equal to 0, which means the page has not been
allocated, according to the comment in that function. I assume this page
has not been allocated because it is located in a region marked as
E820_UNUSABLE by tboot.
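
To make the failure path concrete, here is a rough sketch of the behaviour
described above. It is a simplified toy model, not the actual Xen code: the
struct layouts and the toy_* function names are hypothetical, and the real
page_get_owner_and_reference() does more work than this. It only illustrates
that a page with count_info == 0 yields a NULL owner, which the caller then
reports as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Xen's structures
 * (assumptions for illustration, not the real definitions). */
struct domain {
    int domain_id;
};

struct page_info {
    unsigned long count_info;   /* 0 models "page has not been allocated" */
    struct domain *owner;
};

/* Toy model: refuse to take a reference on a page whose count is 0. */
static struct domain *toy_get_owner_and_reference(struct page_info *page)
{
    assert(page != NULL);
    if (page->count_info == 0)
        return NULL;            /* unallocated page: no owner, no reference */
    page->count_info++;         /* take one reference */
    return page->owner;
}

/* Toy model of the caller: a NULL owner becomes an error return. */
static int toy_get_page_from_l1e(struct page_info *page)
{
    return toy_get_owner_and_reference(page) != NULL ? 0 : -1;
}
```

In this model, a caller that ignores the return value of
toy_get_page_from_l1e() would carry on as if the mapping had succeeded,
which is the situation described for the unchecked hypercall.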
>
> Is tboot causing Xen to do something weird like leaving holes in dom0's
> initial memory allocation?

Yes, there are pfns linked to pages that are not allocated.

The E820 map seen by Xen is:

(XEN) Multiboot-e820 RAM map:
(XEN)  0000000000000000 - 0000000000060000 (usable)
(XEN)  0000000000060000 - 0000000000068000 (reserved)
(XEN)  0000000000068000 - 0000000000091800 (usable)
(XEN)  0000000000091800 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 0000000000800000 (usable)
(XEN)  0000000000800000 - 0000000000975000 (unusable)
(XEN)  0000000000975000 - 0000000020000000 (usable)
(XEN)  0000000020000000 - 0000000020200000 (reserved)
(XEN)  0000000020200000 - 0000000040004000 (usable)
(XEN)  0000000040004000 - 0000000040005000 (reserved)
(XEN)  0000000040005000 - 0000000060000000 (usable)
(XEN)  0000000060000000 - 0000000060200000 (reserved)
(XEN)  0000000060200000 - 00000000c5e00000 (usable)
(XEN)  00000000c5e00000 - 00000000c86f8000 (reserved)
(XEN)  00000000c86f8000 - 00000000c8800000 (ACPI NVS)
(XEN)  00000000c8800000 - 00000000ca3c5000 (reserved)
(XEN)  00000000ca3c5000 - 00000000ca408000 (ACPI NVS)
(XEN)  00000000ca408000 - 00000000cac00000 (reserved)
(XEN)  00000000caf00000 - 00000000cb000000 (reserved)
(XEN)  00000000cb800000 - 00000000cfa00000 (reserved)
(XEN)  00000000f8000000 - 00000000fc000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fed00000 - 00000000fed04000 (reserved)
(XEN)  00000000fed1c000 - 00000000fed30000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 000000012e600000 (usable)


The region 0000000000800000 - 0000000000975000 has been set to unusable
by tboot.

Calls to update_va_mapping show the following error messages (with mfn
going from 0x800 to 0x974):

(XEN) mm.c:911:d0 Error getting mfn 800 (pfn 5555555555555555) from L1 entry 0000000000800463 for l1e_owner=0, pg_owner=0
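
As a sanity check, the mfn range in these errors can be derived from the
unusable region in the E820 map. The sketch below assumes 4 KiB pages
(PAGE_SHIFT of 12) and page-aligned region boundaries; the struct and
function names are made up for illustration and do not come from Xen or
Linux.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages, as on x86 */

/* Inclusive range of machine frame numbers (hypothetical helper type). */
struct mfn_range {
    uint64_t first;
    uint64_t last;
};

/* Convert a page-aligned physical range [start, end) into the
 * inclusive range of frame numbers that it covers. */
static struct mfn_range region_to_mfns(uint64_t start, uint64_t end)
{
    struct mfn_range r;

    assert(end > start);
    r.first = start >> PAGE_SHIFT;
    r.last = (end - 1) >> PAGE_SHIFT;
    return r;
}
```

For the unusable region 0x800000 - 0x975000 this gives frames 0x800 to
0x974, which matches the mfn values reported in the errors above.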


