Re: [Xen-devel] EFI Xen unstable crashes on Dell E6410 when calling efi_get_time.

>>> On 22.10.14 at 11:45, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 22/10/14 01:29, Marcos E. Matsunaga wrote:
>> I went out and got the serial cable. Attached is the full output.
>> On 10/21/2014 05:06 PM, Marcos E. Matsunaga wrote:
>>> Folks,
>>> I am trying to boot Xen using efibootmgr on a Dell E6410 laptop with
>>> 4GB RAM, running an Intel I5 dual core with VT and all the
>>> virtualization options enabled.
>>> It crashes almost immediately. I am working on getting the serial
>>> console up so that I can get a more detailed stack.
>>> A screenshot of the console is attached.
>>> The xen.cfg file is:
>>> [global]
>>> default=xen
>>> [xen]
>>> options=console=vga,com1 com1=115200,8n1 dom0_max_vcpus=2 vga="qxl"
>>> kernel=vmlinuz-3.8.13-48.el7uek.Other_EFI_v1.x86_64 root=UUID=917bfc7f-8d9c-4acf-a98a-a9f558daccf2 ro console=hvc0 enforcing=0 biosdevname=0 earlyprintk=xen nomodeset
>>> ramdisk=initramfs-3.8.13-48.el7uek.Other_EFI_v1.x86_64.img
>>> The codepath is "(gdb) x/20i get_cmos_time
>>>    0xffff82d080188825 <get_cmos_time>:     push   %rbp
>>>    0xffff82d080188826 <get_cmos_time+1>:   mov    %rsp,%rbp
>>>    0xffff82d080188829 <get_cmos_time+4>:   push   %r12
>>>    0xffff82d08018882b <get_cmos_time+6>:   push   %rbx
>>>    0xffff82d08018882c <get_cmos_time+7>:   cmpb   $0x0,0xb620d(%rip)        # 0xffff82d08023ea40 <efi_enabled>
>>>    0xffff82d080188833 <get_cmos_time+14>:  je     0xffff82d080188843 <get_cmos_time+30>
>>>    0xffff82d080188835 <get_cmos_time+16>:  callq  0xffff82d080100069 <efi_get_time>"
> Ok - there are two separate bugs here.
> The first is that we call into the efi runtime via efi_rs->GetTime, and
> a pagefault happens for the instruction at 0x00000000db25a33d for the
> virtual address 0x00000000fed1f410.
> The memory map looks quite weird, but the faulting address is covered in
> this range.
> (XEN)  00000fed1c000-00000fed1ffff type=11 attr=8000000000000000
> So I would expect it to be mapped into the EFI pagetables.

Then you must have missed

(XEN) Unknown cachability for MFNs 0xfed1c-0xfed1f

which means no mapping got established (as we don't know what
cachability attributes to give to it).

This is a firmware bug.
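
To illustrate why no cachability can be derived here: the descriptor
quoted above has attr=8000000000000000, which is the runtime bit alone,
with none of the UC/WC/WT/WB bits set. The following is only a minimal
sketch, not the actual Xen code. The bit values are the ones the UEFI
specification defines; the helper names, the enum, and the order in
which the bits are tested are assumptions made for the example.

```c
#include <stdint.h>

/* Attribute bits as defined for memory descriptors in the UEFI spec. */
#define EFI_MEMORY_UC      0x0000000000000001ULL
#define EFI_MEMORY_WC      0x0000000000000002ULL
#define EFI_MEMORY_WT      0x0000000000000004ULL
#define EFI_MEMORY_WB      0x0000000000000008ULL
#define EFI_MEMORY_RUNTIME 0x8000000000000000ULL

/* Hypothetical result type, introduced only for this sketch. */
enum cache_kind { CACHE_UNKNOWN, CACHE_UC, CACHE_WC, CACHE_WT, CACHE_WB };

/*
 * Hypothetical helper: pick a cachability from the attribute bits.
 * The preference order (WB, WT, WC, UC) is an assumption of the sketch.
 * CACHE_UNKNOWN means the firmware advertised no usable cachability.
 */
enum cache_kind efi_cache_kind(uint64_t attr)
{
    if (attr & EFI_MEMORY_WB)
        return CACHE_WB;
    if (attr & EFI_MEMORY_WT)
        return CACHE_WT;
    if (attr & EFI_MEMORY_WC)
        return CACHE_WC;
    if (attr & EFI_MEMORY_UC)
        return CACHE_UC;
    return CACHE_UNKNOWN;
}

/* Hypothetical helper: does the range need a runtime mapping at all? */
int efi_is_runtime(uint64_t attr)
{
    return (attr & EFI_MEMORY_RUNTIME) != 0;
}
```

For the reported range, efi_is_runtime(0x8000000000000000) is true while
efi_cache_kind(0x8000000000000000) yields CACHE_UNKNOWN, i.e. the
firmware asks for a runtime mapping without saying how to cache it.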

> The EFI code is a mix of #ifdefs.  Can you confirm whether you are
> compiling with USE_SET_VIRTUAL_ADDRESS_MAP or not?

No-one should be altering these; their presence is purely for
documentation purposes.

> The second is that once the pagefault has happened, we trap back into
> Xen and attempt to do a pagetable walk, falling over an assertion in
> map_domain_page().
> For EFI calls, we run on the efi pagetables, not the idle pagetables, so
> I am not surprised that the assertion has failed.  I suspect that the
> pagefault handler for hypervisor faults needs to become wise to the fact
> that we may receive a fault when calling into the firmware.  As all the
> efi pagetables are xenheap pages, there is nothing conceptually wrong
> with using map_domain_page() to do the walk.

I'm not sure it's worth taking care of this special case. But yes, if
we really want to, extending the condition to also consider
efi_l4_pgtable would seem the right thing to do.


