Re: [Xen-devel] Making snapshot of logical volumes handling HVM domU causes OOPS and instability
On 08/30/2010 03:13 PM, Daniel Stodden wrote:
> Are you sure it's spinning or just freezing?

I'm not sure that I understand the difference between those two terms, so I'm going to guess that "freezing" is the more accurate description. My scripted backup procedure would get to a certain point and freeze, and then I couldn't break out of it or kill its PID from another SSH session. The kill command froze the same way: it never returned to a shell prompt, and pressing CTRL-C just showed ^C on the display without breaking out.

> Can you try to find the minimum number of steps necessary to get into
> that state and try something like
>
> $ ps -eH -owchan,nwchan,cmd

The minimum number of steps that I took, just now, to make it happen was as follows:

There's an HVM domU that's active and running Windows 2008 Server, called "scrappy", with the following Xen configuration:

    kernel = "hvmloader"
    builder='hvm'
    memory = 768
    name = "scrappy"
    vcpus=1
    vif = [ 'type=ioemu, mac=00:16:3e:00:00:18, bridge=eth0',
            'type=ioemu, mac=00:16:3e:00:00:19, bridge=xenbr1',
            'type=ioemu, mac=00:16:3e:00:00:1A, bridge=xenbr2' ]
    disk = [ 'phy:hurricanevg1/scrappy-primarymaster,xvda,w',
             'file:/mnt/scratch/WindowsServerStd2008OEM_x86-64.iso,xvdb:cdrom,r',
             'phy:hurricanevg1/scrappy-secondarymaster,xvdc,w' ]
    on_reboot = 'restart'
    device_model = 'qemu-dm'
    sdl=0
    opengl=1
    vnc=1
    vnclisten="192.168.0.90"
    vncdisplay=3
    vncunused=1
    stdvga=0
    serial='pty'
    tsc_mode=0
    localtime=1
    rtc_timeoffset=-3600

While that domU was running, I created a snapshot of the primarymaster volume and removed it, created one for the secondarymaster and removed it, then created another one for the primarymaster and tried to remove it. That last lvremove command froze. A minute or two later, I got a kernel OOPS message on my console similar to the one that I posted before.
These are the commands that I used to create and remove the snapshot volumes:

    lvcreate -L 2G -n scrappy-primarymaster-backupsnap -s hurricanevg1/scrappy-primarymaster
    lvremove hurricanevg1/scrappy-primarymaster-backupsnap
    lvcreate -L 2G -n scrappy-secondarymaster-backupsnap -s hurricanevg1/scrappy-secondarymaster
    lvremove hurricanevg1/scrappy-secondarymaster-backupsnap
    lvcreate -L 2G -n scrappy-primarymaster-backupsnap -s hurricanevg1/scrappy-primarymaster
    lvremove hurricanevg1/scrappy-primarymaster-backupsnap

This time, the console froze completely. I couldn't open any new SSH sessions into the machine, so I couldn't run the ps -eH command that you asked for in your previous message. If I go for another attempt, I'll try to have a few logins already open so I can get that output for you. This is a somewhat critical production server, though, so I didn't want to keep bouncing it in the middle of the day.

> Also, is that sequence completely reproducible or does the behaviour
> change every time? Just trying to see if there's some point where
> deadlock ends and corruption like the one quoted below would start.

It seems to be 3 for 3 at this point.

-- 
Scott Garron

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
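Because the machine stopped accepting new logins once lvremove hung, one way to still get the requested ps output might be to start sampling before the sequence begins. The following is only a sketch under stated assumptions: LOGDIR, DRY_RUN, INTERVAL and the helper function names are hypothetical and not from the thread; the lvcreate, lvremove and ps invocations are the ones quoted in this message. DRY_RUN defaults to 1, which only prints the commands, since running them for real is reported here to hang a production host.

```shell
#!/bin/sh
# Sketch: replay the snapshot create/remove sequence from this message
# while a background loop records what every process is waiting on.
# Hypothetical names (not from the thread): LOGDIR, DRY_RUN, INTERVAL,
# log_ps, sampler, step, cycle, run_sequence.

VG=hurricanevg1
LOGDIR=${LOGDIR:-/var/tmp/lvsnap-debug}
DRY_RUN=${DRY_RUN:-1}      # 1 = print commands only, 0 = really run them
INTERVAL=${INTERVAL:-5}    # seconds between ps samples

# Write one ps sample (wait channels for all processes) and flush it.
log_ps() {
    ps -eH -o wchan,nwchan,cmd > "$LOGDIR/ps-$(date +%s).txt" 2>&1 || true
    sync
}

# Keep sampling until killed, so data exists from before and during a hang.
sampler() {
    while :; do
        log_ps
        sleep "$INTERVAL"
    done
}

# Print the command in dry-run mode, otherwise execute it.
step() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One snapshot create/remove cycle for the given origin volume.
# lvremove may ask for confirmation, as it would in a manual run.
cycle() {
    step lvcreate -L 2G -n "$1-backupsnap" -s "$VG/$1"
    step lvremove "$VG/$1-backupsnap"
}

# The three cycles described in the message, in the same order.
run_sequence() {
    cycle scrappy-primarymaster
    cycle scrappy-secondarymaster
    cycle scrappy-primarymaster
}

if [ "$DRY_RUN" = 1 ]; then
    run_sequence
else
    mkdir -p "$LOGDIR"
    sampler &
    sampler_pid=$!
    run_sequence
    kill "$sampler_pid"
fi
```

If the hang also blocks writes to the filesystem holding LOGDIR, the samples taken after that point will be lost, so pointing LOGDIR at storage outside the affected volume group is probably worth considering.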