[Xen-users] Migrate/Suspend performance of PV and HVM
Hello xen-users,

Lately I have been experimenting with Remus to migrate a guest VM between two different hypervisors, but I have hit an obstacle that prevents me from using HVM guests without too large a performance loss. I have tracked the problem down to the time it takes for the guest VM to suspend. While PV guests usually suspend quite quickly, HVM guests are noticeably slower.

For example, here is an extract from the "dmesg" output of a PV guest after running "xl migrate" (Remus uses the same mechanism):

[42482.409957] Freezing user space processes ... (elapsed 0.004 seconds) done.
[42482.414895] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[42482.416827] PM: freeze of devices complete after 0.223 msecs
[42482.416833] suspending xenstore...
[42482.416901] PM: late freeze of devices complete after 0.059 msecs
[42482.417002] PM: noirq freeze of devices complete after 0.091 msecs
[42482.417102] xen:grant_table: Grant tables using version 1 layout
[42482.417102] PM: noirq restore of devices complete after 0.068 msecs
[42482.417102] PM: early restore of devices complete after 0.053 msecs
[42482.449704] PM: restore of devices complete after 27.866 msecs
[42482.449729] Restarting tasks ... done.

Now compare it with the same output from an HVM guest:

[  149.705054] Freezing user space processes ... (elapsed 0.102 seconds) done.
[  149.808076] Freezing remaining freezable tasks ... (elapsed 0.013 seconds) done.
[  149.878555] PM: freeze of devices complete after 28.919 msecs
[  149.878586] suspending xenstore...
[  149.885652] PM: late freeze of devices complete after 6.989 msecs
[  149.919128] PM: noirq freeze of devices complete after 33.403 msecs
[  149.920594] xen:events: Xen HVM callback vector for event delivery is enabled
[  149.920594] Xen Platform PCI: I/O protocol version 1
[  149.920594] xen:grant_table: Grant tables using version 1 layout
[  149.920594] xen: --> irq=9, pirq=16
[  149.920594] xen: --> irq=8, pirq=17
[  149.920594] xen: --> irq=12, pirq=18
[  149.920594] xen: --> irq=1, pirq=19
[  149.920594] xen: --> irq=6, pirq=20
[  149.920594] xen: --> irq=4, pirq=21
[  149.920594] xen: --> irq=24, pirq=22
[  149.954270] PM: noirq restore of devices complete after 29.099 msecs
[  149.957334] PM: early restore of devices complete after 2.598 msecs
[  150.006117] rtc_cmos 00:02: System wakeup disabled by ACPI
[  150.013313] PM: restore of devices complete after 50.213 msecs

Comparing the time taken by the individual steps of the suspend process (one way to tally these numbers is sketched after this message), the HVM guest is anywhere from about twice to several hundred times slower than the PV guest, depending on the step. This is too big a toll for my system and makes HVM a non-option for my use case (migrating while running a network-intensive workload with strict latency requirements).

I am using Xen 4.10, with Linux 3.5.1 (CentOS, self-compiled kernel) in the dom0 and Linux 4.4.0-87-generic (Ubuntu) in the guest VM. I have confirmed the same observation on different hardware (in fact, on that system HVM was even slower than this).

My questions to you are:

1) Is this normal/expected behaviour, or could it be caused by a configuration problem on my side?

2) How can I improve the suspend performance of the HVM guest?

Thanks for your time,
Frederico Cerveira
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users
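
One way to compare the two logs more directly is to total the per-phase timings that the guest kernel prints around each suspend/resume cycle. Below is a minimal sketch of such a tally from a guest's dmesg output; it assumes only the "PM: ... of devices complete after N msecs" and "Freezing ... (elapsed N seconds)" message formats visible in the logs above, and the script name is arbitrary.

#!/usr/bin/env python3
# Minimal sketch: total the suspend/resume phase timings printed by the
# guest kernel around an "xl migrate" / Remus checkpoint.  Assumes only
# the "PM: ... of devices complete after N msecs" and
# "Freezing ... (elapsed N seconds)" lines shown in the logs above.
# Usage (inside the guest): dmesg | python3 <this script>
import re
import sys

DEVICE_RE = re.compile(r"PM: (?P<phase>.+?) of devices complete after (?P<ms>[0-9.]+) msecs")
TASKS_RE  = re.compile(r"Freezing .*\(elapsed (?P<s>[0-9.]+) seconds\)")

tasks_ms   = 0.0  # freezing user space processes / remaining freezable tasks
freeze_ms  = 0.0  # device freeze phases (freeze, late freeze, noirq freeze)
restore_ms = 0.0  # device restore phases (noirq, early, final restore)

for line in sys.stdin:
    m = DEVICE_RE.search(line)
    if m:
        if "freeze" in m.group("phase"):
            freeze_ms += float(m.group("ms"))
        else:
            restore_ms += float(m.group("ms"))
        continue
    m = TASKS_RE.search(line)
    if m:
        tasks_ms += float(m.group("s")) * 1000.0

print("task freeze   : %8.3f ms" % tasks_ms)
print("device freeze : %8.3f ms" % freeze_ms)
print("device restore: %8.3f ms" % restore_ms)
print("total         : %8.3f ms" % (tasks_ms + freeze_ms + restore_ms))

Fed the two logs above, this gives roughly 5 ms of task freezing, 0.4 ms of device freezing and 28 ms of device restoring for the PV guest, against roughly 115 ms, 69 ms and 82 ms respectively for the HVM guest.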