
[Xen-users] Remus help

  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Jonathan Kirsch <kirsch.jonathan@xxxxxxxxx>
  • Date: Fri, 3 Sep 2010 16:06:26 -0700
  • Delivery-date: Fri, 03 Sep 2010 16:07:43 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>


Help!  I'm testing Remus and cannot get the backup domU to start up successfully.  In brief, I see checkpoint traffic being transferred, and the backup does appear to switch from
the paused state to the running state, but the backup VM is essentially frozen -- I can view its console but cannot interact with it at all.  It is as if the VM is stuck
in an old state and never resumes.  Here are some details:

-Two boxes (Box 1 and Box 2), both running a pvops kernel in dom0.
-Created an HVM guest (Fedora 12) and confirmed that it can start on both machines.

Here is the Remus-relevant line from my xen config file:

disk = [ 'tap2:tapdisk:remus:|aio:/home/jak/remus/XenGuest1.img,hda,w' ]

(Note that the suggested tap:remus... syntax does not work for me, which is why I am using the tap2 syntax).

Here is the procedure that I follow:

1. Start the guest VM on Box 1

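2. Start running a Perl script on the guest VM that simply prints out a number every second:
    $| = 1;  # unbuffered output, so each number appears immediately
    for my $i (1 .. 10000) {
        print "$i\n"; sleep 1;  # one number per second
    }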

3. Start Remus on Box 1:
   sudo remus --no-net XenGuest1

I see checkpoint traffic being transferred.  After a while, it looks like the backup is essentially caught up and
the amount of checkpoint traffic decreases considerably (but there is still some flowing, as I would expect).

4. Now I pull the network cable from Box 1.  Expected behavior: the backup VM on Box 2 should start and
continue printing out one number per second, picking up where the primary left off.

Actual behavior: When I run xm list, I do indeed see that the backup VM is running on Box 2.  Strangely, xm top reports that
both Domain-0 and XenGuest1 are using 100% CPU, which I can't explain.  As noted above, I can connect over VNC and see the screen
of the backup VM.  However, I cannot interact with the VM at all: the desktop does not respond and the numbers are no longer printing.  Basically
the whole thing is frozen, despite xm telling me that it's "running."
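
To make it easier to tell a guest that resumed from a recent checkpoint apart from one that is stuck, the counter from step 2 could also record a wall-clock timestamp with each number.  The following is only a minimal sketch in Python, assuming Python is available in the guest; the function names, the log format and the log path are hypothetical and not part of Remus or of the test described above.

```python
import time


def heartbeat(path, count=10000, interval=1.0, clock=time.time, sleep=time.sleep):
    """Append 'counter timestamp' lines to path, one per interval.

    Each line is flushed right away so that as little as possible is lost
    if the guest stops unexpectedly. (Hypothetical helper, not a Remus tool.)
    """
    with open(path, "a") as log:
        for i in range(1, count + 1):
            log.write(f"{i} {clock():.3f}\n")
            log.flush()
            sleep(interval)


def find_gaps(lines, max_gap=2.0):
    """Return (prev_counter, next_counter, seconds) for every pause
    between consecutive heartbeat lines that is longer than max_gap."""
    gaps = []
    prev = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        counter, stamp = line.split()
        cur = (int(counter), float(stamp))
        if prev is not None and cur[1] - prev[1] > max_gap:
            gaps.append((prev[0], cur[0], cur[1] - prev[1]))
        prev = cur
    return gaps
```

Calling heartbeat("/tmp/heartbeat.log") in the guest (the path is an assumption) and later passing the lines of that file to find_gaps() would show whether the counter kept advancing after failover and how long the pause was.  If the log never grows after the cable is pulled, the guest is not making progress at all, which would match the frozen console described above.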

Does anyone know what's going on?  Or if this is not the right list to be emailing, might someone point me to the correct list?
