
[Xen-users] Xen 4.6 Live Migration and Hotplugging Issues



Hi,

I am trying to set up two Ubuntu 16.04 / Xen 4.6 machines to perform live migration and CPU / memory hotplug. So far I have run into several serious issues, bad enough that I am wondering whether I am on the wrong track altogether.

Any input is highly appreciated!

The setup:

2 Dell M630 with Ubuntu 16.04 and Xen 4.6, 64bit Dom0 (node1 + node2)

2 DomUs: Debian Jessie 64bit PV and Debian Jessie 64bit HVM

Now create a PV DomU on node1 with 1 CPU core and 2 GB RAM, leaving plenty of headroom for hot-add / hotplug:

Config excerpt:

kernel       = "/home/xen/shared/boot/tests/vmlinuz-3.16.0-4-amd64"
ramdisk      = "/home/xen/shared/boot/tests/initrd.img-3.16.0-4-amd64"
maxmem       = 16384
memory       = 2048
maxvcpus     = 8
vcpus        = 1
cpus         = "18"
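For reference, with maxmem and maxvcpus set above the initial values like this, memory and vCPUs can later be grown from the Dom0 with the xl toolstack, roughly as follows (the domain name "debian8" is just a placeholder):

xl mem-set debian8 4096m     # new memory target (suffix m = MiB), up to maxmem
xl vcpu-set debian8 4        # number of online vCPUs, up to maxvcpus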

xm list:

root1823     97  2048     1     -b----      15.1
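For reference, the migration itself is started from node1 with the xl toolstack that ships with Xen 4.6, roughly like this (domain name and target host are placeholders; ssh access between the Dom0s is assumed):

xl migrate debian8 node2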

All is fine. Now migrate to node2 as shown above. Immediately after the migration we see:

xm list:

root182      360 16384     1     -b----      10.5

So the DomU ballooned straight up to its maxmem after the migration, and on top of that, inside the DomU all eight CPUs suddenly show up as hotplugged (but not online, since the DomU is missing the udev rule to online them automatically; see the sketch a bit further down):

root@debian8:~# ls /sys/devices/system/cpu/ | grep cpu
cpu0
cpu1
cpu2
cpu3
cpu4
cpu5
cpu6
cpu7

So this is already not how it is supposed to be (DomU should look the same before and after migration).
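As an aside, the udev rule that would normally online freshly hotplugged CPUs inside the DomU looks roughly like this (the file name is just an example):

# /etc/udev/rules.d/99-online-hotplug-cpu.rules (example name)
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"

That would only automate the next step, though; the problem below happens as soon as a CPU is onlined at all.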

Now we bring cpu1 online by hand:

echo 1 > /sys/devices/system/cpu/cpu1/online

The result, as seen on the DomU's console (hvc) from the Dom0:

[  373.360949] installing Xen timer for CPU 1
[  400.032003] BUG: soft lockup - CPU#0 stuck for 22s! [bash:733]
[  400.032003] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc evdev pcspkr x86_pkg_temp_thermal thermal_sys coretemp crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd autofs4 ext4 crc16 mbcache jbd2 crct10dif_pclmul crct10dif_common xen_netfront xen_blkfront crc32c_intel
[  400.032003] CPU: 0 PID: 733 Comm: bash Not tainted 3.16.0-4-amd64 #1 Debian 3.16.43-2+deb8u3
[  400.032003] task: ffff88000470e1d0 ti: ffff88006acec000 task.ti: ffff88006acec000
[  400.032003] RIP: e030:[<ffffffff810013aa>] [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
[  400.032003] RSP: e02b:ffff88006acefdd0  EFLAGS: 00000246
[  400.032003] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff810013aa
[  400.032003] RDX: ffff88007d640000 RSI: 0000000000000000 RDI: 0000000000000000
[  400.032003] RBP: ffff88006bcf6000 R08: ffff88007d03d5c8 R09: 0000000000000122
[  400.032003] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[  400.032003] R13: 000000000000cd60 R14: ffff88006d1dca20 R15: 000000000007d649
[  400.032003] FS: 00007fe4b215e700(0000) GS:ffff88007d600000(0000) knlGS:0000000000000000
[  400.032003] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 400.032003] CR2: 00000000016de6d0 CR3: 0000000004a67000 CR4: 0000000000042660
[  400.032003] Stack:
[  400.032003]  ffff88006acefb3e 0000000000000000 ffffffff81010dc1 0000000001323d35
[  400.032003]  0000000000000000 0000000000000000 0000000000000001 0000000000000001
[  400.032003]  ffff88006d1dca20 0000000000000000 ffffffff81068cac 000000306aceff3c
[  400.032003] Call Trace:
[  400.032003]  [<ffffffff81010dc1>] ? xen_cpu_up+0x211/0x500
[  400.032003]  [<ffffffff81068cac>] ? _cpu_up+0x12c/0x160
[  400.032003]  [<ffffffff81068d59>] ? cpu_up+0x79/0xa0
[  400.032003]  [<ffffffff8150b615>] ? cpu_subsys_online+0x35/0x80
[  400.032003]  [<ffffffff813a608d>] ? device_online+0x5d/0xa0
[  400.032003]  [<ffffffff813a6145>] ? online_store+0x75/0x80
[  400.032003]  [<ffffffff8121b56a>] ? kernfs_fop_write+0xda/0x150
[  400.032003]  [<ffffffff811aaf32>] ? vfs_write+0xb2/0x1f0
[  400.032003]  [<ffffffff811aba72>] ? SyS_write+0x42/0xa0
[  400.032003]  [<ffffffff8151a48d>] ? system_call_fast_compare_end+0x10/0x15
[  400.032003] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

The same thing happens with the HVM DomU, but always only _after_ a live migration. Hotplugging works flawlessly when it is done on the Dom0 where the DomU was originally started.
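For comparison, the sequence that works when everything is done on the original host looks roughly like this (domain name is a placeholder):

# on node1 (Dom0), no migration involved
xl vcpu-set debian8 2
xl mem-set debian8 4096m

# inside the DomU
echo 1 > /sys/devices/system/cpu/cpu1/online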

Any idea what might be happening here? Has anyone managed to live-migrate a DomU and successfully hotplug CPUs / memory afterwards?

Thanks

Tim

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

 

