Re: [Xen-devel] [PATCH 00 of 10] Teach xm save to checkpoint a
Brendan:

Hi, my name is Yoshi Tamura, working for NTT Labs in Japan.

I tried your patches, and I liked your new feature to checkpoint a running domain. I also tried your patches for live migration, but xc_linux_restore() on the remote machine failed. I tracked down the problem and fixed it by modifying __xen_checkpoint() in machine_reboot.c. Take a look at the following patch. As far as I have tested, it works for both xm save -c and xm migrate --live.

Let me know if you have any comments or better ideas.

Regards,
Yoshi Tamura

Signed-off-by: Yoshi Tamura <tamura.yoshiaki@xxxxxxxxxxxxx>

diff -r 3bde632518a4 linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c
--- a/linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c	Thu Dec 14 23:05:42 2006 -0800
+++ b/linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c	Wed Dec 20 16:21:43 2006 +0900
@@ -171,8 +171,6 @@ int __xen_suspend(void)
 
 	pre_suspend();
 
-	gnttab_checkpoint();
-
 	/*
 	 * We'll stop somewhere inside this hypercall. When it returns,
 	 * we'll start resuming after the restore.
@@ -223,6 +221,8 @@ int __xen_checkpoint(void)
 
 	xenbus_lock();
 
+	gnttab_suspend();
+
 	preempt_disable();
 
 	mm_pin_all();
@@ -257,6 +257,8 @@ int __xen_checkpoint(void)
 	} else {
 		post_checkpoint();
 
+		gnttab_resume();
+
 		local_irq_enable();
 
 		xenbus_unlock();

Brendan Cully wrote:

I think maybe I forgot to mention that I have successfully checkpointed domains and restored them from checkpoints (with file-system activity between checkpoints). It seems to work pretty well. I'll try to put together a demo of this next week.

Regarding full device disconnection, my understanding is that guest domains are already prepared to deal with back-end driver crashes (by maintaining shadows of the ring etc.), so a forced reconnect on resume should be able to recover even if there wasn't an orderly shutdown before the suspend.
I thought when I looked over the code that the reconnect path did a paranoid forced disconnect first anyway (e.g. checking for existing event channels and resetting them). On the other hand, if checkpoints are taken more frequently than they are restored, it seems odd to be constantly detaching and reattaching back-ends in the parent. But if this is unsafe, it should be fairly easy to make the code do a full disconnect before suspend. It might be as easy as changing xm save to write 'suspend' to control/shutdown instead of 'checkpoint'.

On Friday, 15 December 2006 at 08:07, Steven Hand wrote:

Pretty much any PT race in a non-live save/migrate is a bug; the domain is (in theory) suspended at this point, and all of the devices are disconnected. Since you've chosen not to 'disconnect' the devices, you'll get random updates occurring to any shared pages (shared via grants or directly shared with Xen).

I'm not too sure about the last couple of patches in this series. Because the checkpointing domain doesn't disconnect before calling suspend, it retains a few references to pages it doesn't own. These trigger a PT race detector in xc_linux_save, which causes it to abort. So the last couple of patches explicitly identify the references I've found so far (shared_info and some grant table shared pages) and simply zero those PTEs during save, since they'll be recreated on restore. Finding the grant table pages is a bit fragile: I walk the page table loaded in CR3 at the time of suspend, looking for the virtual address I've stowed in the suspend record. I've only got code for two-level page tables at the moment, since I'm not convinced this is the right approach.

Under what circumstances would a non-live save have an unsafe PTE race?

Maybe it's fine to simply zero these PTEs without checking them.

I'd think not.

To clarify, the pages that have caused races in my experiments are always the same 5: shared_info and four grant table shared pages.
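For readers who want to picture the two-level walk described above, here is a minimal sketch. It is not code from the patch series. It assumes the classic 32-bit non-PAE x86 layout (10-bit directory index, 10-bit table index, 12-bit page offset, present flag in bit 0), and it replaces the real step of mapping a guest page-table frame with a lookup in a small in-process array. The names frames, frame_to_table, lookup_pte and zero_pte_for_va are all hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define ENTRIES_PER_TABLE 1024u  /* 2^10 entries per directory or table */
#define PAGE_SHIFT        12     /* 4 KiB pages */
#define PTE_PRESENT       0x1u   /* bit 0: entry is valid */

/* Assumption: frames are slots in this array. A real save tool would have
 * to map the guest's page-table frames instead. */
#define NR_FRAMES 8
static uint32_t frames[NR_FRAMES][ENTRIES_PER_TABLE];

/* Hypothetical helper: turn a frame number into a pointer to its entries. */
uint32_t *frame_to_table(uint32_t frame)
{
    return frame < NR_FRAMES ? frames[frame] : NULL;
}

/* Walk directory -> table and return the PTE slot for va, or NULL if the
 * directory entry is not present or points outside the simulated frames. */
uint32_t *lookup_pte(uint32_t *pgd, uint32_t va)
{
    uint32_t pde = pgd[va >> 22];            /* top 10 bits select the PDE */
    uint32_t *pt;

    if (!(pde & PTE_PRESENT))
        return NULL;
    pt = frame_to_table(pde >> PAGE_SHIFT);  /* frame number is in bits 12+ */
    if (pt == NULL)
        return NULL;
    return &pt[(va >> PAGE_SHIFT) & (ENTRIES_PER_TABLE - 1)];
}

/* Clear the mapping for va. Returns the frame number that had been mapped,
 * or -1 if nothing was mapped there. */
long zero_pte_for_va(uint32_t *pgd, uint32_t va)
{
    uint32_t *pte = lookup_pte(pgd, va);
    long frame;

    if (pte == NULL || !(*pte & PTE_PRESENT))
        return -1;
    frame = (long)(*pte >> PAGE_SHIFT);
    *pte = 0;
    return frame;
}
```

A caller would pass the page directory that was loaded at suspend time together with the virtual address stowed in the suspend record; the returned frame number is the page whose reference was dropped. Three-level (PAE) and four-level layouts would need additional levels in the walk, which this sketch does not attempt.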
The reason these don't cause races in plain save is simply that they are unmapped before suspend is called. Since I've adjusted the kernel to recreate these specific pages on restore (but not in the parent when checkpoint returns), my patches do just zero out the PTEs (simulating in the save code what had previously been done in the guest).

Finding the guest grant table pages is a little annoying though. I ended up having the guest put the virtual address of its mapping into an unused field in the suspend record, then walking the page table to find the MFN. I was thinking it might be better either to get Xen to export a list of pages that the guest has references to, or to assume that any unowned MFNs in the page tables are pages that will be recreated on restore anyway, and just zero them out.

In short, I wonder how often that PT race code has stopped a non-live save. If the answer is 'never', then zeroing out the PTEs might be fine, especially since the original domain is still intact after the checkpoint.

Thanks again for looking this over.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

--
TAMURA, Yoshiaki
NTT Cyber Space Labs
OSS Computing Project
Kernel Group
E-mail: tamura.yoshiaki@xxxxxxxxxxxxx
TEL: (046)-859-2771
FAX: (046)-855-1152
Address: 1-1 Hikarinooka, Yokosuka
Kanagawa 239-0847 JAPAN