
[Xen-users] Xen Remus DRBD dual primary frozen



Dear all,

I reported this problem earlier, but perhaps without enough detail, so here I try to describe it more fully. I hope somebody can help me pinpoint the cause.
First of all, I use Ubuntu 12.04 x64 for both domain0 and domainU, modified to run under the Xen hypervisor and to work with Remus.
I configured Remus following these notes: http://wiki.xen.org/wiki/Install_Xen_4.1.4_with_Remus_and_DRBD_on_Ubuntu_12.10 but I use Xen 4.2.2 as my hypervisor, together with DRBD 8.3.11 with Remus support from this link: http://remusha.wikidot.com/local--files/configuring-and-installing-remus/drbd-8.3.11-remus.tar.gz

If DRBD runs in Primary/Secondary mode, there is no problem. However, Remus runs with DRBD in dual-primary mode. When I start Remus, DRBD freezes, which causes my domainU to freeze as well. dmesg shows the following messages:

[242525.600067] block drbd1: Local backing block device frozen?
[242537.632070] block drbd1: Local backing block device frozen?
[242549.664075] block drbd1: Local backing block device frozen?
[242561.696083] block drbd1: Local backing block device frozen?
[242573.728079] block drbd1: Local backing block device frozen?
[242585.760069] block drbd1: Local backing block device frozen?
[242597.792079] block drbd1: Local backing block device frozen?
[242609.824069] block drbd1: Local backing block device frozen?
[242621.856083] block drbd1: Local backing block device frozen?
[242633.888068] block drbd1: Local backing block device frozen?
[242640.332124] INFO: task blkback.2.xvda:5779 blocked for more than 120 seconds.
[242640.332130] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[242640.332134] blkback.2.xvda  D ffff88003fc13780     0  5779      2 0x00000000
[242640.332142]  ffff880026743940 0000000000000246 000000000000000b ffff8800267402d0
[242640.332150]  ffff880026743fd8 ffff880026743fd8 ffff880026743fd8 0000000000013780
[242640.332157]  ffff880032944500 ffff88003368c500 ffff8800357d6000 ffff8800357d69d8
[242640.332164] Call Trace:
[242640.332178]  [<ffffffff816579cf>] schedule+0x3f/0x60
[242640.332200]  [<ffffffffa00e68d5>] drbd_al_begin_io+0x205/0x270 [drbd]
[242640.332207]  [<ffffffff811adde8>] ? bvec_alloc_bs+0x68/0x100
[242640.332212]  [<ffffffff811adf32>] ? bio_alloc_bioset+0xb2/0xf0
[242640.332219]  [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
[242640.332231]  [<ffffffffa00e41bd>] drbd_make_request_common+0xc4d/0x1430 [drbd]
[242640.332239]  [<ffffffffa01b83ce>] ? xen_blkbk_map+0x24e/0x2f0 [xen_blkback]
[242640.332245]  [<ffffffff81301006>] ? throtl_find_tg+0x46/0x60
[242640.332257]  [<ffffffffa00e4e04>] drbd_make_request+0x464/0x7e0 [drbd]
[242640.332264]  [<ffffffff812f03bb>] ? generic_make_request_checks+0x1eb/0x370
[242640.332269]  [<ffffffff812f0194>] generic_make_request.part.50+0x74/0xb0
[242640.332274]  [<ffffffff812f05a8>] generic_make_request+0x68/0x70
[242640.332278]  [<ffffffff812f0635>] submit_bio+0x85/0x110
[242640.332284]  [<ffffffffa01b8f0f>] dispatch_rw_block_io+0x44f/0x700 [xen_blkback]
[242640.332292]  [<ffffffff8100330e>] ? xen_end_context_switch+0x1e/0x30
[242640.332298]  [<ffffffffa01b93df>] __do_block_io_op+0x21f/0x360 [xen_blkback]
[242640.332304]  [<ffffffffa01b9608>] xen_blkif_schedule+0xb8/0x320 [xen_blkback]
[242640.332309]  [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
[242640.332314]  [<ffffffffa01b9550>] ? xen_blkif_be_int+0x30/0x30 [xen_blkback]
[242640.332319]  [<ffffffff81089fbc>] kthread+0x8c/0xa0
[242640.332326]  [<ffffffff81664034>] kernel_thread_helper+0x4/0x10
[242640.332330]  [<ffffffff816620e3>] ? int_ret_from_sys_call+0x7/0x1b
[242640.332336]  [<ffffffff81659dbc>] ? retint_restore_args+0x5/0x6
[242640.332340]  [<ffffffff81664030>] ? gs_change+0x13/0x13
[242645.920070] block drbd1: Local backing block device frozen?
[242657.952074] block drbd1: Local backing block device frozen?
[242669.984072] block drbd1: Local backing block device frozen?
[242682.016071] block drbd1: Local backing block device frozen?
[242694.048071] block drbd1: Local backing block device frozen?
[242706.080071] block drbd1: Local backing block device frozen?
[242718.112077] block drbd1: Local backing block device frozen?
sb-voip2@sbvoip2:~$ sudo cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@sbvoip2, 2013-02-19 08:30:51

 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate D r-----
    ns:14732 nr:1784712 dw:1799444 dr:579340 al:31 bm:44 lo:1 pe:0 ua:0 ap:1 ep:1 wo:b def:0 chkpt:662 oos:0
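
To make the counters in the status line above easier to read, here is a minimal Python sketch that splits them into key/value pairs and prints the ones that count outstanding I/O. The function name `parse_drbd_counters` and the short descriptions are only illustrative; the descriptions follow my reading of the DRBD 8.3 documentation, and `chkpt` appears to come from the Remus patch, so it is left unexplained.

```python
import re

# Counter line copied from the /proc/drbd output above.
STATUS = ("ns:14732 nr:1784712 dw:1799444 dr:579340 al:31 bm:44 "
          "lo:1 pe:0 ua:0 ap:1 ep:1 wo:b def:0 chkpt:662 oos:0")

# Counters that track requests still in flight
# (descriptions are an assumption based on the DRBD 8.3 documentation).
IN_FLIGHT = {
    "lo": "requests open towards the local backing device",
    "pe": "requests sent to the peer, not yet answered",
    "ua": "requests received from the peer, not yet acknowledged",
    "ap": "requests from the upper layer, not yet answered by DRBD",
}

def parse_drbd_counters(line):
    """Split 'key:value' pairs; numeric values become int, others stay str."""
    counters = {}
    for key, value in re.findall(r"(\w+):(\S+)", line):
        counters[key] = int(value) if value.isdigit() else value
    return counters

counters = parse_drbd_counters(STATUS)

# Keep only the in-flight counters that are non-zero.
stuck = {k: counters[k] for k in IN_FLIGHT if counters.get(k)}
for key, count in stuck.items():
    print(f"{key}={count}: {IN_FLIGHT[key]}")
```

For the line above this reports only `lo` and `ap` as non-zero, which would be consistent with a single request from blkback waiting inside DRBD.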

As we can see, after the DRBD block device freezes, blkback stops working as well:

[242640.332124] INFO: task blkback.2.xvda:5779 blocked for more than 120 seconds.
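
To check the ordering of these events, the timestamps can be pulled out of the dmesg excerpt above. This is a minimal Python sketch using only a few lines copied from that log; the names `STAMP` and `events` are illustrative, and the excerpt may not start at the very first DRBD warning.

```python
import re

# A few lines copied from the dmesg output above.
LOG = """\
[242525.600067] block drbd1: Local backing block device frozen?
[242537.632070] block drbd1: Local backing block device frozen?
[242549.664075] block drbd1: Local backing block device frozen?
[242640.332124] INFO: task blkback.2.xvda:5779 blocked for more than 120 seconds.
"""

# dmesg prefix: "[seconds.microseconds] message"
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

def events(log):
    """Return (timestamp, message) tuples for every line with a dmesg stamp."""
    out = []
    for line in log.splitlines():
        m = STAMP.match(line)
        if m:
            out.append((float(m.group(1)), m.group(2)))
    return out

parsed = events(LOG)
frozen = [t for t, msg in parsed if "device frozen?" in msg]
hung = [t for t, msg in parsed if "blocked for more than" in msg]

# Interval between consecutive DRBD warnings, and the delay until the
# kernel reports the blkback task as hung.
gaps = [round(b - a, 3) for a, b in zip(frozen, frozen[1:])]
delay = round(hung[0] - frozen[0], 3)
print("gaps between DRBD warnings (s):", gaps)
print("first DRBD warning to hung-task report (s):", delay)
```

In this excerpt the DRBD warning repeats about every 12 seconds, and the first warning shown comes roughly 115 seconds before the hung-task report for blkback.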

Someone told me this is caused by high I/O load, but I always monitor my server with xm top, and the server load is always under 50%.
I hope somebody can help me. If you need more logs, I will try to post them.

Many thanks,

Agya
