
Re: [Xen-users] Remus crashes only with Windows Server 2003

On Thu, Feb 9, 2012 at 10:29 AM, Antonio Colin <dftonywhite@xxxxxxxxxxx> wrote:
Hi again Shriram,

Thank you for your reply and explanation. You are right, I need a different port, maybe 9001 in that case, but see...
That was the full test, but in fact I also tested everything with one disk "(Unit C:)" and the same thing happens... if you think
that doing it that way would capture more useful information in the logs, I can save them again :).

The NFS mount is in /mnt/domus only to begin testing remus. I put one VM image there... start remus with --no-net and everything is fine.
The directory /home/remus is just for working with remus and disk replication and is not an NFS mount.

It is so strange that it works only for Linux!! (both are HVM)

And yes, if that directory was shared that might corrupt my disk and I also need DRBD to replicate the image... is that possible for img files?
and just one last question... after failover how can I get back the execution of the VM from the backup to the primary host once it is ready ?

Let me investigate the blktap2 issue first.
DRBD does not replicate img files. You would have to put them in a partition or LVM volume and
replicate that volume to the backup host. Whether you want to write the image directly to the volume or
create a file system in that volume and drop the image file there is up to you.

Thank you so much!!!


From: rshriram@xxxxxxxxx
Date: Thu, 9 Feb 2012 00:35:15 -0800

Subject: Re: [Xen-users] Remus crashes only with Windows Server 2003
To: dftonywhite@xxxxxxxxxxx
CC: xen-users@xxxxxxxxxxxxxxxxxxx

On Wed, Feb 8, 2012 at 1:56 AM, Antonio Colin <dftonywhite@xxxxxxxxxxx> wrote:
Hello Shriram,

Just coming back to Remus HA: three weeks ago I sent this thread and the situation hasn't changed. You are right,
remus works properly with the --no-net option.

There is actually this tapdisk related error in the syslog file in the primary host:
Jan 17 17:28:58 xen-backup tapdisk2[5795]: remus: could not bind server socket 11 to 98 Address already in use
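For reference, "Address already in use" is the message the operating system returns when a process tries to bind a listening socket to an address and port that another socket already holds. The following is only a minimal Python sketch of that condition on the loopback interface; it does not involve tapdisk2 or remus, and the function name bind_same_port_twice is made up for illustration.

```python
import errno
import socket


def bind_same_port_twice(host="127.0.0.1"):
    """Bind two TCP sockets to the same port; return the errno of the second bind.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so the demo does not
        # collide with a service that is really running on this machine.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port as the first socket: this should fail.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = bind_same_port_twice()
    print("second bind failed with:", errno.errorcode.get(code, code))
```

If the second bind fails with EADDRINUSE, that is the same condition the tapdisk2 log line reports.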

Thanks for the logs.
 The first thing that pops out is:
['tap2', ['uname', 'tap2:remus:|aio:/home/remus/win2k3-exchange.img'], ['dev', 'ioemu:hda'], ['mode', 'w']],
['tap2', ['uname', 'tap2:remus:|aio:/home/remus/win2k3-exchange-d.img'], ['dev', 'ioemu:hdb'], ['mode', 'w']],

You have two tapdisk devices, but on the same port? Each disk needs a different port, as a separate TCP connection is
established between primary and backup for each replicated disk.
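A quick way to catch this before starting the domain is to scan the disk entries for repeated ports. The sketch below is only an illustration in Python (the language xm config files use). It assumes disk strings of the form tap2:remus:<host>:<port>|aio:<path>,<dev>,<mode>; the host name backup-host and the ports 9000 and 9001 are placeholders, not values taken from the logs in this thread, and the helper names are made up.

```python
from collections import defaultdict

PREFIX = "tap2:remus:"


def remus_ports(disk_specs):
    """Map each replication port to the disk specs that use it.

    Assumes specs look like 'tap2:remus:<host>:<port>|aio:<path>,<dev>,<mode>'.
    Specs that do not match this shape are skipped.
    """
    ports = defaultdict(list)
    for spec in disk_specs:
        uname = spec.split(",")[0]
        if not uname.startswith(PREFIX):
            continue
        # Text between the prefix and the '|' is expected to be "<host>:<port>".
        target = uname[len(PREFIX):].split("|", 1)[0]
        _host, _sep, port = target.rpartition(":")
        if port.isdigit():
            ports[int(port)].append(spec)
    return ports


def duplicate_ports(disk_specs):
    """Return only the ports that are shared by more than one disk."""
    return {p: s for p, s in remus_ports(disk_specs).items() if len(s) > 1}


if __name__ == "__main__":
    # Hypothetical config: both disks point at the same port on the backup.
    disk = [
        "tap2:remus:backup-host:9000|aio:/home/remus/win2k3-exchange.img,hda,w",
        "tap2:remus:backup-host:9000|aio:/home/remus/win2k3-exchange-d.img,hdb,w",
    ]
    for port, specs in duplicate_ports(disk).items():
        print("port %d is used by %d disks" % (port, len(specs)))
```

Giving the second disk its own port (for example 9001 in this placeholder config) makes duplicate_ports return an empty dict.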

Also when I boot up the VM (Windows Server 2003) from NFS

From NFS? Just to make sure that we are on the same page, is the above directory /home/remus an NFS mount?
i.e. is that win2k3-exchange.img "shared" between the primary and backup host?
If so, then remus disk replication will not work, as it is based on a shared-nothing model.
In fact, it could corrupt your disk badly. If disk consistency is not an issue, then you are better off
running remus without disk replication (though there is no guarantee that the domain will fail over properly).

and without remus or disk replication, in both the primary and the backup
there is in fact a vif attached to it, which is bound to the bridge in both cases.
I have the sch_plug module installed correctly in both hosts and everything works perfect for Linux systems.

Oh great. So network buffering is out of the picture. If it works for Linux, it should work for Windows too.
But it just does not work
for Windows.

I attach xend.log and syslog from primary and backup if you'd like to see further information in order to help me.

Thank you a lot!!


> From: rshriram@xxxxxxxxx
> Date: Fri, 13 Jan 2012 09:54:35 -0800
> To: xen-users@xxxxxxxxxxxxxxxxxxx
> CC: dftonywhite@xxxxxxxxxxx
> Subject: Re: [Xen-users] Remus crashes only with Windows Server 2003

> On Fri, Jan 13, 2012 at 9:05 AM, <xen-users-request@xxxxxxxxxxxxxxxxxxx> wrote:
> > I have set up Remus on Debian Squeeze and kernel 3.1.5. Remus and disk replication work perfectly for Ubuntu systems,
> > but when I start Remus for Windows Server 2003 (running Microsoft Exchange Enterprise 2003) it crashes, giving the
> > following error:
> >
> Is that Ubuntu VM a PV or HVM ?
> I presume that remus with --no-net works properly ?
> > root@neutrino:~/working-remus# xm create exchange-hvm.cfg
> > root@neutrino:~/working-remus# remus exchange-hvm
> > qemu logdirty mode: enable
> > xc: error: Error when writing to state file (4a) (errno 104) (104 = Connection reset by peer): Internal error
> > qemu logdirty mode: disable
> > PROF: resumed at 1326315866.106150
> > resuming QEMU
> > tc filter del dev vif3.0 parent ffff: proto ip pref 10 u32
> > RTNETLINK answers: Invalid argument
> > We have an error talking to the kernel
> > Exception xen.remus.util.PipeException: PipeException('tc failed: 2, No such file or directory',) in <bound method BufferedNIC.__del__ of <xen.remus.device.BufferedNIC object at 0x24b7510>> ignored
> This error tells me nothing. "Connection reset by peer" could result
> from a lot of issues.
> A. check the syslog in primary and backup, for errors related to tapdisk
> B. Check the xend.log file in backup
> C. If your system works with --no-net, then try to boot up the VM
> without remus, and make sure that
> there is a vif interface for the VM. And make sure that interface is
> on the bridge (if you have bridging enabled).
> Remus tries to install a network buffer (sch_plug) to the vif interface.
> > root@neutrino:~/working-remus#
> >
> > It seems that on the backup remus or Xen cannot assign a vif1.0 to the DomU since #ifconfig -a doesn't show a new vif there
> > when starting remus.
> >
> > Any help would be highly appreciated!
> >
> > Tony.
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
