Re: [Xen-devel] Commit moratorium to staging
On Fri, Nov 03, 2017 at 05:57:52PM +0000, George Dunlap wrote:
> On 11/03/2017 02:52 PM, George Dunlap wrote:
> > On 11/03/2017 02:14 PM, Roger Pau Monné wrote:
> >> On Thu, Nov 02, 2017 at 09:55:11AM +0000, Paul Durrant wrote:
> >>> Hmm. I wonder whether the guest is actually healthy after the migrate.
> >>> One could imagine a situation where the storage device model (IDE in our
> >>> case I guess) gets stuck in some way but recovers after a timeout in the
> >>> guest storage stack. Thus, if you happen to try shut down while it is
> >>> still stuck Windows starts trying to shut down but can't. Try after the
> >>> timeout though and it can.
> >>> In the past we did make attempts to support Windows without PV drivers in
> >>> XenServer but xenrt would never reliably pass VM lifecycle tests using
> >>> emulated devices. That was with qemu trad, but I wonder whether upstream
> >>> qemu is actually any better particularly if using older device models
> >>> such as IDE and RTL8139 (which are probably largely unmodified from trad).
> >>
> >> Since I've been looking into this for a couple of days, and found no
> >> solution I'm going to write what I've found so far:
> >>
> >> - The issue only affects Windows guests.
> >> - It only manifests itself when doing live migration, non-live
> >>   migration or save/resume work fine.
> >> - It affects all x86 hardware, the amount of migrations in order to
> >>   trigger it seems to depend on the hardware, but doing 20 migrations
> >>   reliably triggers it on all the hardware I've tested.
> >
> > Not good.
> >
> > You said that Windows reported that the login process failed somehow?
> >
> > Is it possible something bad is happening, like sending spurious page
> > faults to the guest in logdirty mode?
> >
> > I wonder if we could reproduce something like it on Linux -- set a build
> > going and start localhost migrating; a spurious page fault is likely to
> > cause the build to fail.
>
> Well, with a looping xen-build going on in the guest, I've done 40 local
> migrates with no problems yet.
>
> But Roger -- is this on emulated devices only, no PV drivers?
>
> That might be something worth looking at.

Yes, Windows doesn't have PV devices. But save/restore and non-live
migration seem fine, so it doesn't look to be related to devices, but
rather to log-dirty or some other aspect of live migration. Or maybe it
is indeed something related to emulated devices that's more easily
triggered by live migration.

I'm also thinking it would be helpful to run 20 save/restore cycles,
shut down, create the guest again, run 20 migrations and shut down.
That would help us tell problems related to save/restore apart from
problems related to live migration more easily.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
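
The test sequence proposed above (20 save/restore cycles, shutdown, create, 20 migrations, shutdown) could be scripted roughly as follows. This is only a sketch, not something from the thread: it assumes the `xl` toolstack, and the domain name, config path, checkpoint path and the `build_test_plan` helper are all hypothetical. It only builds the command list; a real harness would also have to wait for the guest to boot and check its health between steps.

```python
from typing import List


def build_test_plan(domain: str, config: str, checkpoint: str,
                    host: str = "localhost",
                    rounds: int = 20) -> List[List[str]]:
    """Return the xl command lines for the proposed sequence, in order.

    All argument values are placeholders chosen by the caller; nothing
    here is taken from the actual osstest setup.
    """
    plan: List[List[str]] = []

    # Phase 1: repeated save/restore of the running guest.
    for _ in range(rounds):
        plan.append(["xl", "save", domain, checkpoint])
        plan.append(["xl", "restore", checkpoint])

    # Shut down (waiting for completion), then create a fresh guest.
    plan.append(["xl", "shutdown", "-w", domain])
    plan.append(["xl", "create", config])

    # Phase 2: repeated live migration to the given host.
    for _ in range(rounds):
        plan.append(["xl", "migrate", domain, host])

    # Final shutdown; a hang here in only one phase would point at
    # either save/restore or live migration.
    plan.append(["xl", "shutdown", "-w", domain])
    return plan


if __name__ == "__main__":
    for cmd in build_test_plan("win-guest", "/etc/xen/win-guest.cfg",
                               "/tmp/win-guest.chk"):
        print(" ".join(cmd))
```

Keeping the two phases separated by a shutdown and a fresh create means a failure in the second shutdown cannot be blamed on state left over from the save/restore cycles.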