[qemu-xen master] virtio-balloon: don't start free page hinting if postcopy is possible



commit aa77e375a5b1b91d7646d8b5f8683778f53fbbd3
Author:     David Hildenbrand <david@xxxxxxxxxx>
AuthorDate: Thu Jul 8 11:53:38 2021 +0200
Commit:     Michael Roth <michael.roth@xxxxxxx>
CommitDate: Tue Dec 14 08:56:18 2021 -0600

    virtio-balloon: don't start free page hinting if postcopy is possible
    
    Postcopy never worked properly with 'free-page-hint=on', as there are
    at least two issues:
    
    1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
       and consequently won't release free pages back to the OS once
       migration finishes.
    
       The issue is that for postcopy, we won't do a final bitmap sync while
       the guest is stopped on the source and
       virtio_balloon_free_page_hint_notify() will only call
       virtio_balloon_free_page_done() on the source during
       PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
       the destination.
    
    2) Once the VM touches a page on the destination that has been excluded
       from migration on the source via qemu_guest_free_page_hint() while
       postcopy is active, that thread will stall until postcopy finishes
       and all threads are woken up. (With older Linux kernels that won't
       retry faults when woken up via userfaultfd, we might actually get a
       SEGFAULT.)
    
       The issue is that the source will refuse to migrate any pages that
       are not marked as dirty in the dirty bmap -- for example, because the
       page might just have been sent. Consequently, the faulting thread will
       stall, waiting for the page to be migrated -- which could take quite
       a while and result in guest OS issues.
    
    While we could fix 1) comparatively easily, 2) is harder to get right and
    might require more involved RAM migration changes on source and destination
    [1].
    
    As it never worked properly, the simple fix is to not start free page
    hinting in the precopy notifier if the postcopy migration capability is
    enabled. Checking the capability there is sufficient, because capabilities
    cannot be enabled once migration is already running.
    
    Note 1: in the future we might either adjust migration code on the source
            to track pages that have actually been sent or adjust
            migration code on source and destination to eventually send
            pages multiple times from the source and deal with pages
            that are sent multiple times on the destination.
    
    Note 2: virtio-mem has similar issues; however, access to "unplugged"
            memory by the guest is very rare and we would have to be very
            lucky for it to happen during migration. The spec states
            "The driver SHOULD NOT read from unplugged memory blocks ..."
            and "The driver MUST NOT write to unplugged memory blocks".
            virtio-mem will move away from virtio_balloon_free_page_done()
            soon and handle this case explicitly on the destination.
    
    [1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@xxxxxxxxxx
    
    Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
    Cc: qemu-stable@xxxxxxxxxx
    Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
    Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Cc: Philippe Mathieu-Daudé <philmd@xxxxxxxxxx>
    Cc: Alexander Duyck <alexander.duyck@xxxxxxxxx>
    Cc: Juan Quintela <quintela@xxxxxxxxxx>
    Cc: "Dr. David Alan Gilbert" <dgilbert@xxxxxxxxxx>
    Cc: Peter Xu <peterx@xxxxxxxxxx>
    Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
    Message-Id: <20210708095339.20274-2-david@xxxxxxxxxx>
    Reviewed-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Reviewed-by: Peter Xu <peterx@xxxxxxxxxx>
    (cherry picked from commit fd51e54fa10221e5a8add894c38cc1cf199f4bc4)
    Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
---
 hw/virtio/virtio-balloon.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4b5d9e5e50..ae7867a8db 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "migration/misc.h"
+#include "migration/migration.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
@@ -662,6 +663,18 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
         return 0;
     }
 
+    /*
+     * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
+     * bitmap and will not get migrated, especially also not when the postcopy
+     * destination starts using them and requests migration from the source; the
+     * faulting thread will stall until postcopy migration finishes and
+     * all threads are woken up. Let's not start free page hinting if postcopy
+     * is possible.
+     */
+    if (migrate_postcopy_ram()) {
+        return 0;
+    }
+
     switch (pnd->reason) {
     case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
         virtio_balloon_free_page_stop(dev);
--
generated by git-patchbot for /home/xen/git/qemu-xen.git#master
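
For readers who want to see issue 2) and the effect of the new check in
isolation, here is a minimal, self-contained toy model. It is not QEMU code
and only a sketch under stated assumptions: every name in it (model_dirty,
model_free_page_hint(), model_source_sends_page(), model_stalled_faults())
is hypothetical, the "dirty bitmap" is a plain array of 8 pages, and pages
2 and 5 are arbitrarily chosen as the ones the guest reports as free. It
also assumes that without postcopy the destination never requests individual
pages from the source.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NPAGES 8

/* Toy stand-in for the source's dirty bitmap: true = page still to be sent. */
static bool model_dirty[MODEL_NPAGES];

/* Hypothetical model of hinting: a page reported free is cleared from the bitmap. */
static void model_free_page_hint(int page)
{
    assert(page >= 0 && page < MODEL_NPAGES);
    model_dirty[page] = false;
}

/*
 * Hypothetical model of the source answering a postcopy page request.
 * A page that is not marked dirty is not sent, so the faulting thread
 * on the destination keeps waiting.
 */
static bool model_source_sends_page(int page)
{
    assert(page >= 0 && page < MODEL_NPAGES);
    if (!model_dirty[page]) {
        return false;
    }
    model_dirty[page] = false;
    return true;
}

/*
 * Count destination faults that the source cannot serve.
 * guard_applied models the check added by this patch: hinting is not
 * started when the postcopy capability is enabled.
 */
static int model_stalled_faults(bool postcopy_ram, bool guard_applied)
{
    int stalled = 0;
    int page;

    for (page = 0; page < MODEL_NPAGES; page++) {
        model_dirty[page] = true;
    }

    if (!(guard_applied && postcopy_ram)) {
        /* Hinting runs: the guest reports pages 2 and 5 as free. */
        model_free_page_hint(2);
        model_free_page_hint(5);
    }

    if (!postcopy_ram) {
        /* Assumption: precopy only, no per-page requests from the destination. */
        return 0;
    }

    /* Postcopy: the destination touches every page and requests it. */
    for (page = 0; page < MODEL_NPAGES; page++) {
        if (!model_source_sends_page(page)) {
            stalled++;
        }
    }
    return stalled;
}
```

In this model, model_stalled_faults(true, false) counts the two hinted pages
as stalled requests, while model_stalled_faults(true, true) counts none,
because hinting never cleared any bits. The real patch expresses the same
idea by returning early from the precopy notifier when migrate_postcopy_ram()
is true.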



 

