[Xen-devel] [PATCH RFC 00/20] Add postcopy live migration support
Hi,

We're a team of three fourth-year undergraduate software engineering students at the University of Waterloo in Canada. In late 2015 we posted on the list [1] to ask for ideas for our program's capstone design project, and Andrew Cooper pointed us in the direction of the live migration implementation as an area that could use some attention. We were particularly interested in post-copy live migration (as evaluated by [2] and discussed on the list at [3]), and have been working on an implementation of this on-and-off since then.

We now have a working implementation of this scheme, and are submitting it for comment. The changes are also available as the 'postcopy' branch of the GitHub repository at [4].

As a brief overview of our approach:

- We introduce a mechanism by which libxl can indicate to the libxc stream helper process that the iterative migration precopy loop should be terminated and postcopy should begin.
- At this point, we suspend the domain, collect the final set of dirty pfns and write these pfns (and _not_ their contents) into the stream.
- At the destination, the xc restore logic registers itself as a pager for the migrating domain, 'evicts' all of the pfns indicated by the sender as outstanding, and then resumes the domain at the destination.
- As the domain executes, the migration sender continues to push the remaining outstanding pages to the receiver in the background. The receiver monitors both the stream for incoming page data and the paging ring event channel for page faults triggered by the guest. Page faults are forwarded on the back-channel migration stream to the migration sender, which prioritizes these pages for transmission.

By leveraging the existing paging API, we are able to implement the postcopy scheme without any hypervisor modifications - all of our changes are confined to the userspace toolstack.
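
To make the last step of the overview concrete, here is a rough toy model of the order in which a sender might transmit pages during the postcopy phase. It is only an illustrative sketch in Python, not code from the series: the class and method names (PostcopySender, on_fault, next_pfn) are hypothetical, and the real logic lives in the C code under tools/libxc. It assumes the sender tracks a set of outstanding pfns, pushes them in the background, and serves faulted pfns first.

```python
from collections import deque


class PostcopySender:
    """Toy model of sender-side page ordering during postcopy (hypothetical)."""

    def __init__(self, outstanding_pfns):
        # pfns still owed to the receiver, in background-push order.
        self.background = deque(sorted(set(outstanding_pfns)))
        self.outstanding = set(self.background)
        # pfns the guest has faulted on; these are served before background pages.
        self.urgent = deque()

    def on_fault(self, pfn):
        """Record a fault request arriving on the back-channel stream."""
        # Requests for pages that were already sent are ignored.
        if pfn in self.outstanding:
            self.urgent.append(pfn)

    def next_pfn(self):
        """Return the next pfn to transmit, or None once nothing is outstanding."""
        for queue in (self.urgent, self.background):
            while queue:
                pfn = queue.popleft()
                # Skip entries that were sent earlier via the other queue.
                if pfn in self.outstanding:
                    self.outstanding.discard(pfn)
                    return pfn
        return None
```

In this model a fault on a page that is still outstanding moves it ahead of the background push, while a fault on a page that has already been sent has no effect.
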
However, we inherit from the paging API the requirement that the domains be HVM and that the host have HAP/EPT support.

We haven't yet had the opportunity to perform a quantitative evaluation of the performance trade-offs between the traditional pre-copy and our post-copy strategies, but intend to. Informally, we've been testing our implementation by migrating a domain running the x86 memtest program (which is obviously a tremendously write-heavy workload), and have observed a substantial reduction in total time required for migration completion (at the expense of a visually obvious 'slowdown' in the execution of the program). We've also noticed that, when performing a postcopy without any leading precopy iterations, the time required at the destination to 'evict' all of the outstanding pages is substantial - possibly because there is no batching mechanism by which pages can be evicted - so this area in particular might require further attention.

We're really interested in any feedback you might have!

Thanks!

Harley Armstrong, Chester Lin, Joshua Otto

[1] https://lists.gt.net/xen/devel/410255
[2] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.184.2368
[3] https://lists.gt.net/xen/devel/261568
[4] https://github.com/jtotto/xen

Joshua Otto (20):
  tools: rename COLO 'postcopy' to 'aftercopy'
  libxc/xc_sr: parameterise write_record() on fd
  libxc/xc_sr_restore.c: use write_record() in send_checkpoint_dirty_pfn_list()
  libxc/xc_sr_save.c: add WRITE_TRIVIAL_RECORD_FN()
  libxc/xc_sr: factor out filter_pages()
  libxc/xc_sr: factor helpers out of handle_page_data()
  migration: defer precopy policy to libxl
  libxl/migration: add precopy tuning parameters
  libxc/xc_sr_save: introduce save batch types
  libxc/xc_sr_save.c: initialise rec.data before free()
  libxc/migration: correct hvm record ordering specification
  libxc/migration: specify postcopy live migration
  libxc/migration: add try_read_record()
  libxc/migration: implement the sender side of postcopy live migration
  libxc/migration: implement the receiver side of postcopy live migration
  libxl/libxl_stream_write.c: track callback chains with an explicit phase
  libxl/libxl_stream_read.c: track callback chains with an explicit phase
  libxl/migration: implement the sender side of postcopy live migration
  libxl/migration: implement the receiver side of postcopy live migration
  tools: expose postcopy live migration support in libxl and xl

 docs/specs/libxc-migration-stream.pandoc |  184 ++++-
 docs/specs/libxl-migration-stream.pandoc |   19 +-
 tools/libxc/include/xenguest.h           |  170 ++--
 tools/libxc/xc_nomigrate.c               |    3 +-
 tools/libxc/xc_private.c                 |   21 +-
 tools/libxc/xc_private.h                 |    2 +
 tools/libxc/xc_sr_common.c               |  118 ++-
 tools/libxc/xc_sr_common.h               |  152 +++-
 tools/libxc/xc_sr_common_x86.c           |    2 +-
 tools/libxc/xc_sr_restore.c              | 1297 +++++++++++++++++++++++++-----
 tools/libxc/xc_sr_restore_x86_hvm.c      |   38 +-
 tools/libxc/xc_sr_save.c                 |  828 +++++++++++++++----
 tools/libxc/xc_sr_save_x86_hvm.c         |   18 +-
 tools/libxc/xc_sr_save_x86_pv.c          |   17 +-
 tools/libxc/xc_sr_stream_format.h        |   15 +-
 tools/libxc/xg_save_restore.h            |   16 +-
 tools/libxl/libxl.h                      |   44 +-
 tools/libxl/libxl_colo_restore.c         |    2 +-
 tools/libxl/libxl_colo_save.c            |    2 +-
 tools/libxl/libxl_create.c               |  167 +++-
 tools/libxl/libxl_dom_save.c             |   55 +-
 tools/libxl/libxl_domain.c               |   41 +-
 tools/libxl/libxl_internal.h             |   79 +-
 tools/libxl/libxl_remus.c                |    2 +-
 tools/libxl/libxl_save_callout.c         |    3 +-
 tools/libxl/libxl_save_helper.c          |    7 +-
 tools/libxl/libxl_save_msgs_gen.pl       |   10 +-
 tools/libxl/libxl_sr_stream_format.h     |   13 +-
 tools/libxl/libxl_stream_read.c          |  136 +++-
 tools/libxl/libxl_stream_write.c         |  161 ++--
 tools/ocaml/libs/xl/xenlight_stubs.c     |    2 +-
 tools/xl/xl.h                            |    7 +-
 tools/xl/xl_cmdtable.c                   |   25 +-
 tools/xl/xl_migrate.c                    |   85 +-
 tools/xl/xl_vmcontrol.c                  |    8 +-
 35 files changed, 3144 insertions(+), 605 deletions(-)

--
2.7.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel