
RE: [Xen-devel] HVM Save/Restore status.



 

> -----Original Message-----
> From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxxxxx] 
> Sent: 25 April 2007 15:09
> To: Petersson, Mats
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Woller, Thomas
> Subject: Re: [Xen-devel] HVM Save/Restore status.
> 
> Hi, 
> 
> At 12:58 +0200 on 25 Apr (1177505885), Petersson, Mats wrote:
> > My "disk-stress" tests have the following status:
> > 1. SLES 9.3 using VNC as display has run for over 23 virtual hours,
> > some 40 or so hours since I set off the test without any failures.
> 
> That's great! 
> 
> > There's one difference between this test and previous ones: I've
> > disabled the blanking of the screen - there appears to be a problem
> > waking the screen after some time, not sure why that would be.
> 
> What are the symptoms there?  Is the guest still alive?  Is qemu-dm
> alive?  Does it respond on the network, and just have a wedged console?
> (Might it be the keyboard + mouse that have got wedged?)

Good question. It turns out (from an attempt to shut the guest down
cleanly before rebooting into a new Linux kernel with debug code in it)
that although the guest is still running, I have lost at least:
- Network. I can't ping the guest or SSH to it at the IP address it had
when it first got an address from DHCP. Presumably the IP address
shouldn't change (it doesn't on other machines that get their addresses
from the same DHCP server).
- Keyboard. Pressing, for example, CTRL-C to stop the running application
doesn't work. No other keys appear to have any effect either.
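
A minimal sketch of how the "does it respond on the network" question
could be checked from another machine without relying on ping alone. It
only tests whether a TCP port accepts connections; the function name,
the port and the timeout are illustrative assumptions, not part of any
Xen tooling.

```python
import socket


def tcp_port_answers(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper: a refused connection, an unreachable host or a
    timeout all count as "not answering".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, no route to host and timeouts.
        return False
```

For example, tcp_port_answers("192.0.2.10", 22) would probe the SSH port
of a guest at that (placeholder) address. A False result cannot by
itself tell a dead guest from a wedged network device.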

It's unclear to me whether any other operations are affected. [Time
seemed a bit odd too, but that may be my app - I haven't debugged that
yet. It kept cycling within a 2-3 second range around 23h14m18-20s (or
thereabouts), where the time comes from "time()" - so perhaps there's
something wrong in the "gettimeofday" functionality too.]
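
A rough sketch of how the clock symptom could be separated from the test
application: sample the wall clock repeatedly inside the guest and
report every point where it steps backwards. This is not the
disk-stress test itself; both function names are hypothetical, and
Python's time.time() is used here as a stand-in for the C "time()"
call mentioned above.

```python
import time


def find_backward_steps(samples):
    """Return (index, previous, current) for each sample earlier than its predecessor."""
    return [
        (i, samples[i - 1], samples[i])
        for i in range(1, len(samples))
        if samples[i] < samples[i - 1]
    ]


def sample_wall_clock(count, interval=0.01):
    """Collect `count` wall-clock readings, sleeping `interval` seconds between them."""
    samples = []
    for _ in range(count):
        samples.append(time.time())
        time.sleep(interval)
    return samples
```

Running find_backward_steps(sample_wall_clock(1000)) inside the guest
should give an empty list on a healthy clock. A clock that cycles
within a few seconds would show up as repeated backward steps.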

> 
> > 2. "Simple-guest" fails to restore on the second restore, ending up
> > with the guest "killed". Scanning the xend.log, I find "error zeroing
> > magic pages". Looking further down that path, it seems like it's
> > failing to do "xc_map_foreign_range"... I'm adding some debug output
> > to try to determine where it goes wrong here.
> 
> Strange.  Are you doing anything weird with the ioreq or xenstore pages
> in the simple guest?  Their PFNs should have been maintained across the
> first save/restore cycle, and they were mappable the first time...

I'm trying to see what fails and where by printing something at every
failure point. So far I've tracked it down to somewhere inside the
function direct_remap_pfn_range... I'm not sure where in this function,
or in which of the functions it calls, it goes wrong. As far as I can
see, there aren't many things that can go wrong there...

--
Mats
> 
> Cheers,
> 
> Tim.
> 
> -- 
> Tim Deegan <Tim.Deegan@xxxxxxxxxxxxx>, XenSource UK Limited
> Registered office c/o EC2Y 5EB, UK; company number 05334508
> 
> 
> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

