[Xen-users] Xen 4.10: domU crashes during/after live-migrate



Hi all,

We (at Mendix) are upgrading our dom0s to Xen 4.10 (PV) running on Debian
Stretch (Linux 4.9), but we are running into an issue with live migration.

We are experiencing domU crashes during live migration and in the seconds after
the live migration has completed. This doesn't happen every time, but we can
reproduce the issue within 1 to at most 10 live migrations between two dom0s.
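
A minimal sketch of how such a reproduction loop could be automated, in Python.
It assumes things that are not part of our setup as described here: root SSH
access to both dom0s, a plain "xl migrate" without a custom transport, and a
hypothetical domain name. The "run" parameter exists only so the loop can be
exercised without a real hypervisor.

```python
import subprocess


def migrate_back_and_forth(domain, hosts, max_rounds=10, run=subprocess.run):
    """Live-migrate `domain` between the two dom0s in `hosts` until a step fails.

    Returns the 1-based number of the first failing migration, or None if
    all `max_rounds` migrations went through.
    """
    src, dst = hosts
    for attempt in range(1, max_rounds + 1):
        # Assumption: SSH access to the source dom0 and a stock `xl migrate`.
        migrate = run(["ssh", src, "xl", "migrate", domain, dst])
        if migrate.returncode != 0:
            return attempt
        # `xl list <domain>` is expected to exit non-zero when the domain
        # is not present on the destination.
        listed = run(["ssh", dst, "xl", "list", domain])
        if listed.returncode != 0:
            return attempt
        # Swap roles so the next round migrates the domU back.
        src, dst = dst, src
    return None
```

For example, migrate_back_and_forth("vm1", ("altair", "rho")) would bounce a
domU named "vm1" (a made-up name) between the two hosts. A crashed domU can
stay visible in the domain list, so a None result does not prove the guest
survived; the guest console still has to be checked.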

We've reproduced this so far with domUs running Linux 4.9.82-1+deb9u3 (Debian
Stretch) and 4.15.11-1 (Debian Buster).

Attached are the kernel traces and oopses for each crash, as logged by the domUs
and retrieved via "xl console" in the seconds after the live migration
completed. In some cases the domU keeps running or remains visible in "xl
list"; in other cases the domU disappears from "xl list" after a short time.

From the logging in our dom0s, in most cases everything looks fine:

Apr 12 16:58:20 altair socat[738]: migration target: Ready to receive domain.
Apr 12 16:58:20 altair socat[738]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 16:58:20 altair socat[738]:  Savefile contains xl domain config in JSON format
Apr 12 16:58:20 altair socat[738]: Parsing config from <saved>
Apr 12 16:58:20 altair socat[738]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using q
Apr 12 16:58:20 altair socat[738]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 16:58:20 altair socat[738]: xc: info: Restoring domain
Apr 12 16:58:28 altair socat[738]: xc: info: Restore successful
Apr 12 16:58:28 altair socat[738]: xc: info: XenStore: mfn 0xce734b, dom 0, evt 1
Apr 12 16:58:28 altair socat[738]: xc: info: Console: mfn 0xce734c, dom 0, evt 2

... but one second later the domU gets a kernel panic (see attachment oops-1.txt).

There are cases where the dom0 logs a failure. After this failure the domU 
disappeared:

Apr 12 14:17:55 altair socat[738]: migration target: Ready to receive domain.
Apr 12 14:17:55 altair socat[738]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 14:17:55 altair socat[738]:  Savefile contains xl domain config in JSON format
Apr 12 14:17:55 altair socat[738]: Parsing config from <saved>
Apr 12 14:17:55 altair socat[738]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using qemu-xen-traditional instead: No such file or directory
Apr 12 14:17:55 altair socat[738]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 14:17:55 altair socat[738]: xc: info: Restoring domain
Apr 12 14:18:00 altair socat[738]: libxl-save-helper: xc_sr_restore_x86_pv.c:7: pfn_to_mfn: Assertion `pfn <= ctx->x86_pv.max_pfn' failed.
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_utils.c:510:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 7 save/restore helper stdout pipe
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: domain 7 save/restore helper [18962] died due to fatal signal Aborted
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_create.c:1264:domcreate_rebuild_done: Domain 7:cannot (re-)build domain: -3
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 7:Non-existant domain
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 7:Unable to destroy guest
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 7:Destruction of domain failed
Apr 12 14:18:00 altair socat[738]: migration target: Domain creation failed (code -3).
Apr 12 14:18:00 altair socat[18950]: E write(5, 0x559e0ffc85c0, 8192): Broken pipe

In this case the domU was running on the destination dom0, but it crashed
immediately (see attachment oops-2.txt):

Apr 12 14:44:24 rho socat[725]: migration target: Ready to receive domain.
Apr 12 14:44:24 rho socat[725]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 14:44:24 rho socat[725]:  Savefile contains xl domain config in JSON format
Apr 12 14:44:24 rho socat[725]: Parsing config from <saved>
Apr 12 14:44:24 rho socat[725]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using qemu-xen-traditional instead: No such file or directory
Apr 12 14:44:24 rho socat[725]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 14:44:24 rho socat[725]: xc: info: Restoring domain
Apr 12 14:45:31 rho socat[725]: xc: error: Failed to read Record Header from stream (0 = Success): Internal error
Apr 12 14:45:31 rho socat[725]: xc: error: Restore failed (0 = Success): Internal error
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: restoring domain: Success
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_create.c:1264:domcreate_rebuild_done: Domain 11:cannot (re-)build domain: -3
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 11:Non-existant domain
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 11:Unable to destroy guest
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 11:Destruction of domain failed
Apr 12 14:45:31 rho socat[725]: migration target: Domain creation failed (code -3).
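
To tally how often each of the dom0-side outcomes above occurs across many
migration attempts, something like the following Python sketch could be run
over the syslog of a destination dom0. The marker strings are taken from the
excerpts above; the outcome labels and function names are made up for this
sketch, and it assumes that attempts on one host do not interleave.

```python
from collections import Counter

# Every attempt in the excerpts above starts with this line.
START = "migration target: Ready to receive domain."

# (label, marker substring). Order matters: the specific failures are
# checked before the generic "Domain creation failed" line, which appears
# in both failing excerpts.
PATTERNS = [
    ("assertion_failed", "pfn_to_mfn: Assertion"),
    ("stream_read_failed", "Failed to read Record Header from stream"),
    ("creation_failed", "Domain creation failed"),
    ("restore_ok", "xc: info: Restore successful"),
]


def _label(attempt_lines):
    """Return the first matching label for one attempt's log lines."""
    text = "\n".join(attempt_lines)
    for name, marker in PATTERNS:
        if marker in text:
            return name
    return "unknown"


def classify_attempts(lines):
    """Split syslog lines into migration attempts and label each one."""
    outcomes = []
    current = None
    for line in lines:
        if START in line:
            if current is not None:
                outcomes.append(_label(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        outcomes.append(_label(current))
    return outcomes


def tally(lines):
    """Count how often each outcome label occurs."""
    return Counter(classify_attempts(lines))
```

Note that "restore_ok" only describes the dom0 side: as in the first excerpt,
the domU can still panic a second after a restore that the dom0 logs as
successful, so the guest console output is needed as well.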

We have been running Xen 4.4 on Debian Jessie (Linux 3.16.51-3+deb8u1) on the
same hardware flawlessly for years.

Does anyone have similar experiences with Xen 4.10? How can we help debug
and find the cause of these issues?

Thanks!

-- 
Pim van den Berg

Attachment: oops-1.txt
Description: Text document

Attachment: oops-2.txt
Description: Text document

Attachment: oops-3.txt
Description: Text document

Attachment: oops-4.txt
Description: Text document

 

