
Re: [Xen-devel] Problems using xl migrate

On 24/11/14 14:09, Wei Liu wrote:
> On Mon, Nov 24, 2014 at 01:13:25PM +0000, Andrew Cooper wrote:
>> On 24/11/14 11:50, George Dunlap wrote:
>>> On Mon, Nov 24, 2014 at 12:07 AM, M A Young <m.a.young@xxxxxxxxxxxx> wrote:
>>>> On Sat, 22 Nov 2014, M A Young wrote:
>>>>> While investigating a bug reported on Red Hat Bugzilla
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1166461
>>>>> I discovered the following
>>>>> xl migrate --debug domid localhost does indeed fail for Xen 4.4 pv (the
>>>>> bug report is for Xen 4.3 hvm) when xl migrate domid localhost works.
>>>>> There are actually two issues here
>>>>> * the segfault in libxl-save-helper --restore-domain (as reported in the
>>>>> bug above) occurs if the guest memory is 1024M (on my 4G box) and is
>>>>> presumably because the allocated memory eventually runs out
>>>> I have found a bit more out about this. The segfault is at line 1378 of
>>>> tools/libxc/xc_domain_restore.c which is
>>>>                 DPRINTF("************** pfn=%lx type=%lx gotcs=%08lx "
>>>>                         "actualcs=%08lx\n", pfn, pagebuf->pfn_types[pfn],
>>>>                         csum_page(region_base + (i + curbatch)*PAGE_SIZE),
>>>>                         csum_page(buf));
>>>> and is because pfn in pagebuf->pfn_types[pfn] is beyond the end of the
>>>> array. This occurs in the verification phase.
>>>>> * the segfault doesn't occur if the guest memory is 128M, but the
>>>>> migration still fails. The first attached file contains the log from a run
>>>>> with xl -v migrate --debug domid localhost (with mfn and duplicated lines
>>>>> stripped out to make the size manageable).
>>>> The difference actually seems to be down to how active the VM is rather
>>>> than the memory size (my small memory test system was doing very little, my
>>>> larger system was a full OS install). In the non-segfault case the problem
>>>> was the printf and printf_info commands in the create_domain() routine in
>>>> tools/libxl/xl_cmdimpl.c. As xl migrate uses stdout to pass status
>>>> messages back from the restoring dom0, these commands cause an unexpected
>>>> message. If you move them onto stderr then the migration completes in the
>>>> non-segfault case.
>>> Good job tracking those down -- are there patches in the works?
>> The segfault for "--debug" has already been identified and a patch
>> posted by Wen Congyang.
>> The call to csum_page() incorrectly calculates the offset it is supposed
>> to checksum, and wanders beyond the mapping of guest space.
>> Patch in 1409908261-18682-3-git-send-email-wency@xxxxxxxxxxxxxx
> And the said patch has been applied (3460eeb3fc2) so we're fine.

But not backported to 4.4, which is why Michael is falling over it.
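
For anyone reading this in the archive, the failure described earlier in the thread (an index running past the end of pagebuf->pfn_types in the verification phase) can be illustrated with a minimal C sketch. This is not the libxc code and not the patch referenced above; struct pfn_batch and batch_type_at() are hypothetical names made up for the illustration. It only shows the general guard of checking an index against the array length before reading.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a batch of page types; NOT the libxc pagebuf. */
struct pfn_batch {
    const unsigned long *types; /* one entry per page in the batch */
    size_t count;               /* number of valid entries in types[] */
};

/*
 * Bounds-checked lookup (hypothetical helper).
 * Returns 1 and stores the entry in *out when idx is in range,
 * returns 0 and leaves *out untouched otherwise.
 */
static int batch_type_at(const struct pfn_batch *b, size_t idx,
                         unsigned long *out)
{
    assert(out != NULL);

    if (b == NULL || b->types == NULL || idx >= b->count)
        return 0; /* idx is beyond the end of the array: refuse to read */

    *out = b->types[idx];
    return 1;
}
```

A debug print built on such a helper would skip or flag an out-of-range entry rather than dereference it, which is the class of failure seen with --debug here.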

