[Xen-users] Xen 4.6 Live Migration and Hotplugging Issues
Hi,

I am trying to set up two Ubuntu 16.04 / Xen 4.6 machines to perform live
migration and CPU / memory hotplug. So far I have encountered several
catastrophic issues. They are so severe that I am starting to think I might
be on the wrong track altogether. Any input is highly appreciated!

The setup:

  2 Dell M630 with Ubuntu 16.04 and Xen 4.6, 64-bit Dom0 (node1 + node2)
  2 DomUs: Debian Jessie 64-bit PV and Debian Jessie 64-bit HVM

Now create a PV DomU on node1 with 1 CPU core and 2 GB RAM, and plenty of
room for hot-add / hotplug. Config excerpt:

  kernel   = "/home/xen/shared/boot/tests/vmlinuz-3.16.0-4-amd64"
  ramdisk  = "/home/xen/shared/boot/tests/initrd.img-3.16.0-4-amd64"
  maxmem   = 16384
  memory   = 2048
  maxvcpus = 8
  vcpus    = 1
  cpus     = "18"

xm list:

  root1823 97 2048 1 -b---- 15.1

All is fine. Now migrate to node2. Immediately after the migration we see:

xm list:

  root182 360 16384 1 -b---- 10.5

So the DomU immediately ballooned to its maxmem after the migration, and even
better, inside the DomU we see that all CPUs are suddenly hotplugged (but not
online, due to missing udev rules):

  root@debian8:~# ls /sys/devices/system/cpu/ | grep cpu
  cpu0  cpu1  cpu2  cpu3  cpu4  cpu5  cpu6  cpu7

So this is already not how it is supposed to be (the DomU should look the
same before and after migration).
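(For reference, the missing rule would be something along these lines in
/etc/udev/rules.d/ inside the DomU; this is only a sketch of the kind of rule
that auto-onlines hot-added CPUs, it is not installed in the DomU above:)

  # bring CPUs online as soon as they are hot-added (sketch only)
  SUBSYSTEM=="cpu", ACTION=="add", ATTR{online}=="0", ATTR{online}="1"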
Now we take cpu1 online:

  echo 1 > /sys/devices/system/cpu/cpu1/online

Result, as seen through hvc on the Dom0:

  [  373.360949] installing Xen timer for CPU 1
  [  400.032003] BUG: soft lockup - CPU#0 stuck for 22s! [bash:733]
  [  400.032003] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc evdev pcspkr x86_pkg_temp_thermal thermal_sys coretemp crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd autofs4 ext4 crc16 mbcache jbd2 crct10dif_pclmul crct10dif_common xen_netfront xen_blkfront crc32c_intel
  [  400.032003] CPU: 0 PID: 733 Comm: bash Not tainted 3.16.0-4-amd64 #1 Debian 3.16.43-2+deb8u3
  [  400.032003] task: ffff88000470e1d0 ti: ffff88006acec000 task.ti: ffff88006acec000
  [  400.032003] RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
  [  400.032003] RSP: e02b:ffff88006acefdd0  EFLAGS: 00000246
  [  400.032003] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff810013aa
  [  400.032003] RDX: ffff88007d640000 RSI: 0000000000000000 RDI: 0000000000000000
  [  400.032003] RBP: ffff88006bcf6000 R08: ffff88007d03d5c8 R09: 0000000000000122
  [  400.032003] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
  [  400.032003] R13: 000000000000cd60 R14: ffff88006d1dca20 R15: 000000000007d649
  [  400.032003] FS:  00007fe4b215e700(0000) GS:ffff88007d600000(0000) knlGS:0000000000000000
  [  400.032003] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  400.032003] CR2: 00000000016de6d0 CR3: 0000000004a67000 CR4: 0000000000042660
  [  400.032003] Stack:
  [  400.032003]  ffff88006acefb3e 0000000000000000 ffffffff81010dc1 0000000001323d35
  [  400.032003]  0000000000000000 0000000000000000 0000000000000001 0000000000000001
  [  400.032003]  ffff88006d1dca20 0000000000000000 ffffffff81068cac 000000306aceff3c
  [  400.032003] Call Trace:
  [  400.032003]  [<ffffffff81010dc1>] ? xen_cpu_up+0x211/0x500
  [  400.032003]  [<ffffffff81068cac>] ? _cpu_up+0x12c/0x160
  [  400.032003]  [<ffffffff81068d59>] ? cpu_up+0x79/0xa0
  [  400.032003]  [<ffffffff8150b615>] ? cpu_subsys_online+0x35/0x80
  [  400.032003]  [<ffffffff813a608d>] ? device_online+0x5d/0xa0
  [  400.032003]  [<ffffffff813a6145>] ? online_store+0x75/0x80
  [  400.032003]  [<ffffffff8121b56a>] ? kernfs_fop_write+0xda/0x150
  [  400.032003]  [<ffffffff811aaf32>] ? vfs_write+0xb2/0x1f0
  [  400.032003]  [<ffffffff811aba72>] ? SyS_write+0x42/0xa0
  [  400.032003]  [<ffffffff8151a48d>] ? system_call_fast_compare_end+0x10/0x15
  [  400.032003] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

The same happens on the HVM DomU, but always only _after_ live migration.
Hotplugging works flawlessly if done on the Dom0 where the DomU was
originally started.

Any idea what might be happening here? Has anyone managed to migrate a DomU
and hotplug it afterwards?

Thanks
Tim
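PS: For completeness, by migration and hotplug I mean the plain toolstack
operations, roughly along the following lines (<domU> is just a placeholder
and the commands are meant as an illustration, not a copy of my exact
invocations):

  # live migration from node1 to node2
  xl migrate <domU> node2

  # CPU / memory hotplug, which works fine as long as the DomU has not been migrated
  xl vcpu-set <domU> 4
  xl mem-set <domU> 8192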
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users