
Re: [Xen-devel] xl create failure on arm64 with XEN 4.9rc6



On Fri, May 26, 2017 at 5:40 AM, Julien Grall <julien.grall@xxxxxxx> wrote:
>
>
> On 26/05/17 01:37, Feng Kan wrote:
>>
>> On Thu, May 25, 2017 at 12:56 PM, Julien Grall <julien.grall@xxxxxxx>
>> wrote:
>>>
>>> (CC toolstack maintainers)
>>>
>>> On 25/05/2017 19:58, Feng Kan wrote:
>>>>
>>>>
>>>> Hi All:
>>>
>>>
>>>
>>> Hello,
>>>
>>>> This is not specifically against the XEN 4.9. I am using a 4.12rc2
>>>> kernel on an arm64 platform. Dom0 started fine with ACPI enabled, but
>>>> creating the domU guest failed. Xen is built natively on the
>>>> arm64 platform. The guest uses the same kernel and ramdisk as dom0.
>>>> Any idea as to why it is stuck here would be greatly appreciated.
>>>
>>>
>>>
>>> The first step would be to try a stable release if you can. Also, it would
>>> be useful if you provided information about the guest (i.e. the
>>> configuration) and your .config for the kernel.
>>
>> I am using the default xen_defconfig in the arm64 directory.
>
>
> I am confused. There is no xen_defconfig in the arm64 directory of the
> kernel. So which one are you talking about?
Sorry, my mistake.
>
>> This is very early on in building the domain; would the guest
>> configuration matter?
>
>
> The configuration of DOM0 kernel matters when you want to build the guest.
> That's why I wanted to know what options you enabled.
I see. I am using the default CentOS 7.2 kernel config with the XEN
option enabled (attached below).
>
>>>
>>> I gave a try on Linux 4.12rc2 and I was not able to reproduce your error.
>>
>> Thanks, I started with 4.11 and worked my way up. I have the same
>> problem in both cases.
>
>
> I cannot rule out a problem in your .config until you send me the full one.
>
>
>>
>>>
>>>>
>>>> xc: error: panic: xc_dom_boot.c:178: xc_dom_boot_domU_map: failed to
>>>> mmap domU pages 0x450c2+0x2f3d [mmap, errno=22 (Invalid argument),
>>>> chunksize 0x1000]: Internal error
>>>> libxl: error: libxl_dom.c:679:libxl__build_dom: xc_dom_build_image
>>>> failed: Invalid argument
>>>> domainbuilder: detail: xc_dom_release: called
>>>> libxl: error: libxl_create.c:1217:domcreate_rebuild_done: Domain
>>>> 1:cannot (re-)build domain: -3
>>>> libxl: debug: libxl_domain.c:1140:devices_destroy_cb: Domain 1:Forked
>>>> pid 2477 for destroy of domain
>>>> libxl: debug: libxl_create.c:1646:do_domain_create: Domain 0:ao
>>>> 0x1ae10cb0: inprogress: poller=0x1ae10d40, flags=i
>>>> libxl: debug: libxl_event.c:1869:libxl__ao_complete: ao 0x1ae10cb0:
>>>> complete, rc=-3
>>>> libxl: debug: libxl_event.c:1838:libxl__ao__destroy: ao 0x1ae10cb0:
>>>> destroy
>>>> libxl: debug: libxl_domain.c:871:libxl_domain_destroy: Domain 1:ao
>>>> 0x1ae10cb0: create: how=(nil) callback=(nil) poller=0x1ae10d40
>>>>
>>>> It seems to fail when mmapping pages for the ramdisk. I did some digging
>>>> and the failure occurs during the IOCTL_PRIVCMD_MMAPBATCH_V2 call. It
>>>> seems the 8192th page had an error code of -22.
>>>
>>>
>>>
>>> -22 is -EINVAL. There are quite a few paths that return -EINVAL. Did you
>>> try to narrow down the failure in the kernel?
>>
>> I dug a bit deeper: in privcmd_ioctl_mmap_batch, a global error was
>> detected for mmap_batch_fn during the mapping of the second half of the
>> 0x2f3d pages. I am still trying to track down why it is the 8192th
>> element that causes the problem. It seems like too much of a coincidence
>> that it is the first element of the second half.
>
>
> Can you explain what you mean by "global error"?
In linux/drivers/xen/privcmd.c:mmap_batch_fn()
        ret = xen_remap_domain_gfn_array(st->vma, st->va & PAGE_MASK, gfnp, nr,
                                         (int *)gfnp, st->vma->vm_page_prot,
                                         st->domain, cur_pages);

        /* Adjust the global_error? */
        if (ret != nr) {
^^^Sorry for the bad English; it is more that global_error is updated
because of the error return.
                if (ret == -ENOENT)
                        st->global_error = -ENOENT;
                else {
                        /* Record that at least one error has happened. */
                        if (st->global_error == 0)
                                st->global_error = 1;
                }
        }
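
To make the branch above easier to follow, here is a minimal userspace sketch of how the flag accumulates across batches. It is only a model of the quoted snippet, not the kernel code: struct batch_state and record_batch_result() are hypothetical names, and only the global_error field is modelled. As in the snippet, ret stands for the return value of the remap call and nr for the number of entries in the batch.

```c
#include <errno.h>

/* Hypothetical stand-in for the driver's per-call state; only the
 * field discussed above is modelled. */
struct batch_state {
    int global_error; /* 0: no error, 1: some error, -ENOENT: missing frame */
};

/* Mirrors the quoted branch: any short or failed batch sets the flag,
 * and -ENOENT always overrides the generic "1" marker. */
static void record_batch_result(struct batch_state *st, int ret, int nr)
{
    if (ret == nr)
        return;                      /* whole batch mapped, nothing to record */

    if (ret == -ENOENT) {
        st->global_error = -ENOENT;
    } else if (st->global_error == 0) {
        st->global_error = 1;        /* remember that at least one error happened */
    }
}
```

In this model a single -EINVAL from one batch is enough to leave global_error at 1 for the rest of the call, which matches what I meant by a "global error" being detected.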



>
>>
>>>
>>>  The system has plenty of memory.
>>>>
>>>>
>>>> Afterward, a null guest is created.
>>>> As a side note, how do I get rid of it?
>>>
>>>
>>>
>>> Normally the domain should be destroyed by the tools if the building
>>> failed.
>>>
>>> You should be able to destroy it using 'xl destroy domid', where domid is
>>> the domain ID of the domain. If it does not work, then it means dom0 is
>>> holding a reference on some page belonging to that domain.
>>
>> Makes sense, since I can rename it, but destroying it still leaves it as null.
>
>
> This is likely because someone (such as dom0) kept a reference on a page
> belonging to the domain.
>
> Cheers,
>
> --
> Julien Grall

Attachment: xen.config
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel