Re: [Xen-devel] Problems using xl migrate
On Mon, 24 Nov 2014, Wei Liu wrote:

> On Mon, Nov 24, 2014 at 01:13:25PM +0000, Andrew Cooper wrote:
>> On 24/11/14 11:50, George Dunlap wrote:
>>> On Mon, Nov 24, 2014 at 12:07 AM, M A Young <m.a.young@xxxxxxxxxxxx> wrote:
>>>> On Sat, 22 Nov 2014, M A Young wrote:
>>>>> While investigating a bug reported on Red Hat Bugzilla
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1166461
>>>>> I discovered the following
>>>>>
>>>>> xl migrate --debug domid localhost
>>>>>
>>>>> does indeed fail for Xen 4.4 pv (the bug report is for Xen 4.3 hvm) when
>>>>>
>>>>> xl migrate domid localhost
>>>>>
>>>>> works. There are actually two issues here
>>>>>
>>>>> * the segfault in libxl-save-helper --restore-domain (as reported in
>>>>> the bug above) occurs if the guest memory is 1024M (on my 4G box) and
>>>>> is presumably because the allocated memory eventually runs out
>>>>
>>>> I have found a bit more out about this. The segfault is at line 1378 of
>>>> tools/libxc/xc_domain_restore.c, which is
>>>>
>>>>     DPRINTF("************** pfn=%lx type=%lx gotcs=%08lx "
>>>>             "actualcs=%08lx\n", pfn, pagebuf->pfn_types[pfn],
>>>>             csum_page(region_base + (i + curbatch)*PAGE_SIZE),
>>>>             csum_page(buf));
>>>>
>>>> and is because pfn in pagebuf->pfn_types[pfn] is beyond the end of the
>>>> array. This occurs in the verification phase.
>>>>
>>>>> * the segfault doesn't occur if the guest memory is 128M, but the
>>>>> migration still fails. The first attached file contains the log from a
>>>>> run with xl -v migrate --debug domid localhost (with mfn and duplicated
>>>>> lines stripped out to make the size manageable).
>>>>
>>>> The difference actually seems to be down to how active the VM is rather
>>>> than the memory size (my small memory test system was doing very little,
>>>> my larger system was a full OS install). In the non-segfault case the
>>>> problem was the printf and printf_info commands in the create_domain()
>>>> routine in tools/libxl/xl_cmdimpl.c. As xl migrate uses stdout to pass
>>>> status messages back from the restoring dom0, these commands cause an
>>>> unexpected message. If you move them onto stderr then the migration
>>>> completes in the non-segfault case.
>>>
>>> Good job tracking those down -- are there patches in the works?
>>
>> The segfault for "--debug" has already been identified and a patch posted
>> by Wen Congyang. The call to csum_page() incorrectly calculates the offset
>> it is supposed to checksum, and wanders beyond the mapping of guest space.
>>
>> Patch in 1409908261-18682-3-git-send-email-wency@xxxxxxxxxxxxxx
>
> And the said patch has been applied (3460eeb3fc2) so we're fine.

However that doesn't fix my crash. I tried with it applied and still saw the crash. I also tried 4.5-rc1 (without XSM to avoid my other issue) and that crashed as well.

Michael Young

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
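
For reference, the two failure modes discussed in this thread (an index past the end of the type array in a debug print, and diagnostics landing on the stream that carries migration status) can be illustrated with a small standalone sketch. This is not libxc or xl code and it is not the patch mentioned above. The names demo_pagebuf, demo_lookup_type and demo_debug_print are hypothetical, and the nr_types field is an assumption made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a page buffer; NOT the real libxc structure. */
struct demo_pagebuf {
    unsigned long *pfn_types; /* per-entry type words */
    size_t nr_types;          /* assumed: number of valid entries */
};

/*
 * Look up the type word for an index, refusing anything outside the array.
 * Returns 0 and fills *type_out on success, -1 if the index is out of range.
 */
static int demo_lookup_type(const struct demo_pagebuf *buf,
                            unsigned long idx, unsigned long *type_out)
{
    if (buf == NULL || buf->pfn_types == NULL || idx >= buf->nr_types)
        return -1;
    *type_out = buf->pfn_types[idx];
    return 0;
}

/*
 * Emit a debug line on the stream chosen by the caller. Passing stderr keeps
 * diagnostics away from stdout, which matters when stdout carries a protocol.
 */
static int demo_debug_print(FILE *out, const struct demo_pagebuf *buf,
                            unsigned long idx)
{
    unsigned long type;

    if (demo_lookup_type(buf, idx, &type) != 0) {
        fprintf(out, "index %lx is outside the type array (%zu entries)\n",
                idx, buf != NULL ? buf->nr_types : (size_t)0);
        return -1;
    }
    fprintf(out, "index=%lx type=%lx\n", idx, type);
    return 0;
}
```

Calling demo_debug_print(stderr, &buf, idx) with an out-of-range idx reports the problem instead of reading past the array, and nothing is written to stdout either way.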