Re: [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.
Paolo,

--On 18 March 2013 17:19:14 +0100 Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

> I remembered this incorrectly, sorry. It's not from a previous run, it's
> from the beginning of this run. See http://wiki.qemu.org/Migration/Storage
> for more information. A VM has a disk backed by NFS. It runs on node A,
> at which point pages are introduced to the page cache. It then migrates
> to node B, which entails starting the VM on node B while it is still
> running on node A. The file has not yet been closed on node A, but it is
> already open on node B, so anything cached on node B will never be
> invalidated. Thus, any changes made to the disk on node A during
> migration may never become visible on node B.

This might be a difference between Xen and KVM. On Xen, the migration is
made to a server in a paused state, and the destination is only unpaused
when the migration to B is complete; there is a sort of extra handshake at
the end. I believe what happens is that libxl_domain_suspend, when called
with LIBXL_SUSPEND_LIVE, does a final fsync()/fdatasync() at the end, then
awaits a migrate_receiver_ready message, and only once that has been
received does it send a migrate_permission_to_go message, which unpauses
the domain. Before that, I don't believe the disk is read (though I may be
wrong about that). The sending code is in migrate_domain() in xl_cmdimpl.c,
and the receiving code is in migrate_receive() in the same file. On Xen, at
least, I don't think the VM is ever started on node B while it is still
running on node A.

>> I've no problem if xl or libvirt or whatever error or warn. My usage is
>> API based, rather than xl / libvirt based.
>
> What makes libvirt not an API (just like libxl)?

Nothing; it's just that I'm using the QMP API and the libxl API. I'm only
saying that whether libvirt or xl warns or errors makes no difference to
me.

> If libxl does migration without O_DIRECT, then that's a bug in libxl.

What about blkback?
> IIRC it uses bios, so it also bypasses the page cache.

Possibly a bug in xl rather than libxl, but since no emulated devices use
O_DIRECT, that bug is already there, and isn't in QEMU.

> blkback is the in-kernel PV device, it's not an emulated device.

I mean that an emulated device will already not use O_DIRECT. So if you
are right that live migration is unsafe without O_DIRECT, it is already
unsafe for emulated devices.

>> Stefano did ack the patch, and for a one-line change it's been through
>> a pretty extensive discussion on xen-devel ...
>
> It may be a one-line change, but it completely changes the paths that
> I/O goes through. Apparently the discussion was not enough.

>> What would you suggest?
>
> Nothing except fixing the bug in the kernel. I have already posted
> patches for that, as Ian Campbell did in 2008, but no one seems
> particularly interested. Be my guest in trying to get them adopted.

That's quite obviously the long-term solution. In the meantime, however,
there is a need to run Xen on kernels with long-term support; not being
able to run Xen in a stable manner is not an acceptable position.

No one has yet explained why blkback is not susceptible to the same bug. I
would guess it will be if it uses O_DIRECT (or whatever the in-kernel
equivalent is), unless it copies the guest pages before the write is
marked as complete. I can't claim to be familiar with blkback, but I
presume this would require a similar fix elsewhere.

-- 
Alex Bligh

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel