
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5



On 2008-07-31 21:58, nathan@xxxxxxxxxxxx wrote:
I am running DRBD primary/primary on CentOS 5.2 with CLVM and GFS with no problems. The only issue I have with live migration is that the ARP cache takes 10-15 seconds to get refreshed, so you lose connectivity during that time. I have this problem with the 3.0-ish xen on CentOS 5.2 as well as with xen 3.2.1.
One can run a job on the vm to generate a packet every second or two to 
resolve this; ping in a loop should do it.
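For example, a minimal keep-alive along these lines can run inside the domU (the gateway address and interface are only examples); with iputils installed, an explicit gratuitous ARP right after migration is another option:

    # keep upstream ARP caches warm from inside the domU
    # (gateway IP is an example; adjust to your network)
    while true; do
        ping -c 1 -W 1 192.168.0.1 >/dev/null 2>&1
        sleep 2
    done

    # or announce the new location explicitly after migration
    # (iputils arping; -U sends unsolicited/gratuitous ARP)
    arping -U -c 3 -I eth0 192.168.0.10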
My scenario doesn't involve any clustered filesystem. I'm using phy: 
drbd devices as the backing for the vm, not files. As far as I 
understand things, a clustered filesystem shouldn't be necessary, as 
long as the drbd devices are in sync at the moment migration occurs.
But the question remains whether that condition is guaranteed, and I 
hope to hear from someone who knows the answer to that question...
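For reference, the backing in question is just the ordinary phy: disk stanza pointing at the drbd device (device and guest target names below are only examples):

    # domU config fragment: guest backed directly by the drbd block device
    disk = [ 'phy:/dev/drbd1,hda,w' ]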
Anyway, other than the ARP issue, I have this working in production with about two dozen DomUs.
Note: If you want to use LVM for xen rather than files on GFS/LVM/DRBD,
you need to run the latest DRBD that supports max-bio-bvecs.
I'm actually running drbd on top of LVM. But I'll look into the 
max-bio-bvecs thing anyway out of curiosity.
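From a quick look at the docs, max-bio-bvecs appears to be a disk-section option in drbd.conf; a fragment showing where it would go, with an illustrative value only -- see drbd.conf(5) for your version:

    resource r0 {
        disk {
            # workaround for bio size issues seen with some phy:/LVM stacks
            max-bio-bvecs 1;
        }
        # on-host sections omitted for brevity
    }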
Thanks for the reply.

On Thu, 31 Jul 2008, Antibozo wrote:
Greetings.

I've reviewed the list archives, particularly the posts from Zakk, on this subject, and found results similar to his. drbd provides a block-drbd script, but with full virtualization, at least on RHEL 5, this does not work; by the time the block script is run, the qemu-dm has already been started.
Instead I've simply been musing over the possibility of keeping the drbd
devices in primary/primary state at all times. I'm concerned about a 
race condition, however, and want to ask if others have examined this 
alternative.
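For concreteness, the always-dual-primary arrangement I have in mind is just the standard drbd 8 knobs, along these lines (resource name is only an example; on-host sections omitted):

    resource r0 {
        protocol C;
        net {
            # allow both nodes to hold the device primary at the same time
            allow-two-primaries;
        }
        startup {
            # promote both nodes to primary automatically at startup
            become-primary-on both;
        }
    }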
I am thinking of a scenario where the vm is running on node A, and has 
a process that is writing to disk at full speed, and consequently the 
drbd device on node B is lagging. If I perform a live migration 
from node A to B under this condition, the local device on node B 
might not be in sync at the time the vm is started on that node. Maybe.
If I use drbd protocol C, theoretically at least, a sync on the device 
on node A shouldn't return until node B is fully in sync. So I guess 
my main question is: during migration, does xend force a device sync 
on node A before the vm is started on node B?
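Absent an authoritative answer, the best I can come up with is checking the state by hand before migrating, something along these lines (resource, domain, and target host names are only examples):

    #!/bin/sh
    # crude pre-migration check: migrate only if the resource is fully in sync
    if [ "$(drbdadm dstate r0)" = "UpToDate/UpToDate" ]; then
        xm migrate --live vm01 nodeB
    else
        echo "r0 not in sync, refusing to migrate" >&2
        exit 1
    fi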
A secondary question I have (and this may be a question for the drbd 
folks as well) is: why is the block-drbd script necessary? I.e. why 
not simply leave the drbd primary/primary at all times--what benefit 
is there to marking the device secondary on the standby node?
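As far as I can tell from the drbd documentation, the block-drbd helper exists so the domU config can name the resource directly and have it promoted on the target and demoted on the source around start, stop, and migration, e.g. (resource name is only an example):

    # domU config fragment when /etc/xen/scripts/block-drbd is in use
    disk = [ 'drbd:r0,hda,w' ]

But if the device is already primary on both nodes, that bookkeeping seems redundant, hence the question.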
Or am I just very confused? Does anyone else have thoughts or 
experience on this matter? All responses are appreciated.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

