[Xen-devel] [RFC Patch v2 00/16] COarse-grain LOck-stepping Virtual Machines for Non-stop Service
Virtual machine (VM) replication is a well-known technique for providing
application-agnostic, software-implemented hardware fault tolerance
("non-stop service"). Currently, Remus provides this function, but it
buffers all output packets, and the resulting latency is unacceptable.

At Xen Summit 2012, we introduced a new VM replication solution: COLO
(COarse-grain LOck-stepping virtual machine). The presentation is at the
following URL:
http://www.slideshare.net/xen_com_mgr/colo-coarsegrain-lockstepping-virtual-machines-for-nonstop-service

Here is the summary of the solution:
From the client's point of view, as long as the client observes identical
responses from the primary and secondary VMs, according to the service
semantics, then the secondary VM (SVM) is a valid replica of the primary
VM (PVM), and can successfully take over when a hardware failure of the
PVM is detected.

This patchset is an RFC, and implements the framework of COLO:
1. Both PVM and SVM are running
2. Do a checkpoint only when the output packets from PVM and SVM are
   different
3. Cache write requests from SVM

ChangeLog from v1 to v2:
1. update block-remus to support colo
2. split large patch to small one
3. fix some bugs
4. add a new hypercall for colo

Changelog:
Patch 1: optimize the dirty pages transfer speed.
Patch 2-3: allow SVM running after checkpoint
Patch 4-5: modification for colo on the master side (wait for a new
           checkpoint, communicate with the slave when doing a checkpoint)
Patch 6-7: implement colo's user interface

Wen Congyang (16):
  xen: introduce new hypercall to reset vcpu
  block-remus: introduce colo mode
  block-remus: introduce a interface to allow the user specify which mode
    the backup end uses
  dominfo.completeRestore() will be called more than once in colo mode
  xc_domain_restore: introduce restore_callbacks for colo
  colo: implement restore_callbacks init()/free()
  colo: implement restore_callbacks get_page()
  colo: implement restore_callbacks flush_memory
  colo: implement restore_callbacks update_p2m()
  colo: implement restore_callbacks finish_restore()
  xc_restore: implement for colo
  XendCheckpoint: implement colo
  xc_domain_save: flush cache before calling callbacks->postcopy()
  add callback to configure network for colo
  xc_domain_save: implement save_callbacks for colo
  remus: implement colo mode

 tools/blktap2/drivers/block-remus.c               |  188 ++++-
 tools/libxc/Makefile                              |    8 +-
 tools/libxc/xc_domain_restore.c                   |  264 ++++--
 tools/libxc/xc_domain_restore_colo.c              |  939 +++++++++++++++++++++
 tools/libxc/xc_domain_save.c                      |   23 +-
 tools/libxc/xc_save_restore_colo.h                |   14 +
 tools/libxc/xenguest.h                            |   51 ++
 tools/libxl/Makefile                              |    2 +-
 tools/python/xen/lowlevel/checkpoint/checkpoint.c |  322 +++++++-
 tools/python/xen/lowlevel/checkpoint/checkpoint.h |    1 +
 tools/python/xen/remus/device.py                  |    8 +
 tools/python/xen/remus/image.py                   |    8 +-
 tools/python/xen/remus/save.py                    |   13 +-
 tools/python/xen/xend/XendCheckpoint.py           |  127 ++-
 tools/python/xen/xend/XendDomainInfo.py           |   13 +-
 tools/remus/remus                                 |   28 +-
 tools/xcutils/Makefile                            |    4 +-
 tools/xcutils/xc_restore.c                        |   36 +-
 xen/arch/x86/domain.c                             |   57 ++
 xen/arch/x86/x86_64/entry.S                       |    4 +
 xen/include/public/xen.h                          |    1 +
 21 files changed, 1947 insertions(+), 164 deletions(-)
 create mode 100644 tools/libxc/xc_domain_restore_colo.c
 create mode 100644 tools/libxc/xc_save_restore_colo.h

-- 
1.7.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
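
For readers new to the idea, the three-point frame in the cover letter (both
VMs run, checkpoint only when their output packets differ, cache SVM write
requests) can be illustrated with a small toy model. This is only a sketch
and not code from the patchset: the class and method names are hypothetical.
It also assumes that the PVM's packet is the one released to the client and
that cached SVM writes are dropped at a checkpoint, neither of which the
cover letter states.

```python
from dataclasses import dataclass, field


@dataclass
class ColoSketch:
    """Toy model of the COLO frame. All names are hypothetical."""
    checkpoints: int = 0
    # Packets released to the client (assumption: the PVM's copy is sent).
    released: list = field(default_factory=list)
    # sector -> data for write requests issued by the SVM.
    svm_write_cache: dict = field(default_factory=dict)

    def svm_write(self, sector, data):
        # Point 3: SVM write requests are cached instead of being applied.
        self.svm_write_cache[sector] = data

    def on_output(self, pvm_packet, svm_packet):
        # Point 2: checkpoint only when the two outputs differ.
        if pvm_packet != svm_packet:
            self.checkpoint()
        self.released.append(pvm_packet)

    def checkpoint(self):
        # Assumption: the SVM is resynchronised from the PVM here, so
        # its cached writes are no longer needed and are dropped.
        self.checkpoints += 1
        self.svm_write_cache.clear()


colo = ColoSketch()
colo.svm_write(7, b"x")
colo.on_output(b"a", b"a")   # identical output, no checkpoint
colo.on_output(b"b", b"c")   # divergent output, checkpoint taken
```

In this model, identical output costs nothing beyond the comparison. That is
the contrast with Remus, which buffers every output packet until its next
checkpoint.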