Re: [Xen-users] Xen3.3 / Xen3.4 CPU soft lockups under pvops 2.6.31/2.6.32
Pim van Riezen wrote:
> Good day,
>
> We're trying to get 2.6.31 and 2.6.32 rolled out on our clusters to offer
> newer features like FUSE for our customers, but we've run into a couple of
> showstopper issues when deploying these kernels on busier guests, showing a
> lot of errors like this:
>
> BUG: soft lockup - CPU#0 stuck for 561s! [swapper:0]
> Modules linked in:
> CPU 0:
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.32.9xls-domU #2
> RIP: e030:[<ffffffff810093aa>] [<ffffffff810093aa>] hypercall_page+0x3aa/0x1001
> RSP: e02b:ffffffff81691f70 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffffffff81690000 RCX: ffffffff810093aa
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: ffffffff81896d30 R08: 0000000000000000 R09: ffffffff8100e3b2
> R10: 0000000000000001 R11: 0000000000000246 R12: ffffffffffffffff
> R13: ffffffff818ebf20 R14: ffffffff818eec70 R15: 0000000000000000
> FS: 00007f8ac7a9c6e0(0000) GS:ffff8800022ac000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f291b4c9000 CR3: 000000007d8c1000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Call Trace:
> [<ffffffff8100ddb7>] ? xen_safe_halt+0xc/0x15
> [<ffffffff8100bdcf>] ? xen_idle+0x37/0x40
> [<ffffffff8100fe2e>] ? cpu_idle+0x4f/0x82
> [<ffffffff818b6c42>] ? start_kernel+0x353/0x35f
>
> In the hope of getting rid of this issue we upgraded from Xen 3.3 to Xen 3.4.1.7
> out of the gitco repos. The issue persisted. Is there a magic version of Xen,
> preferably one that can be found in an rpm repository for CentOS 5, that
> *does* properly support pvops kernels without these issues?
>
> I'm also seeing this one:
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=543 where the bug is
> still open but has had no activity since 2008. I don't know if that bugzilla
> is still being actively maintained?
>
> Cheers,
> Pim

Hi Pim,

I am having similar issues with different versions of the Xen hypervisor (3.2, 3.4) and with domU kernels >= 2.6.26 (up to 2.6.32-4 from Debian), always on the same 2 or 3 VMs that are frequently under heavy load.

After a lot of searching, I thought that my CPU soft lockup problems (which sometimes make my VMs freeze) were perhaps related to the xen clocksource, so I decided to give this a try:
http://wiki.debian.org/Xen#A.27clocksource.2BAC8-0.3ATimewentbackwards.27

Using jiffies + independent wallclock + ntp in the domU seems to have stopped the CPU soft lockup messages in the kernel log (at least I haven't had any since I started using it, but that has only been 2 days...). Now I am crossing my fingers... :-)

I also read on xen-devel that you are using FC LUNs for storage. I use that too, so you may want to have a look at the "Interrupt handling in Xen" message that was posted on this list yesterday. By default my domain-0 was handling all its interrupts (network and HBA) on the same CPU, which is probably a bottleneck under heavy load.

Cheers,

-- 
Yann Cézard - Administrateur Systèmes Serveurs
Centre de Ressources Informatiques - http://cri.univ-pau.fr
Université de Pau et des Pays de l'Adour - http://www.univ-pau.fr
Bat IFR, rue Jules Ferry, 64000 PAU - Tél.: +33 (0)5 59 40 77 94

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
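
For anyone who wants to check the two things mentioned in the reply above (which clocksource a guest is using, and whether one CPU is taking nearly all interrupts), here is a minimal Python sketch. It is not from the thread. It assumes the usual Linux layout of /proc/interrupts and the sysfs file /sys/devices/system/clocksource/clocksource0/current_clocksource; the helper names and the SAMPLE text (including its counts and device names) are made up for illustration.

```python
import os

# Made-up sample in the layout of /proc/interrupts. The counts and device
# names are illustrative only and are NOT taken from the thread.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:        900          0          0          0   eth0
 17:        600          0          0          0   hba0
 18:        100        120        110         90   timer
ERR:          0
"""

CLOCKSOURCE_PATH = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def per_cpu_totals(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = rest.split()[:len(cpus)]
        # Skip rows without one numeric column per CPU (e.g. "ERR: 0").
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


def busiest_share(totals):
    """Return (cpu, fraction of all interrupts handled by that cpu)."""
    total = sum(totals.values())
    cpu = max(totals, key=totals.get)
    return cpu, (totals[cpu] / total if total else 0.0)


def current_clocksource(path=CLOCKSOURCE_PATH):
    """Return the active clocksource name, or None if the file is unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Use the live file when present, otherwise fall back to the sample.
    if os.path.exists("/proc/interrupts"):
        with open("/proc/interrupts") as f:
            text = f.read()
    else:
        text = SAMPLE
    cpu, share = busiest_share(per_cpu_totals(text))
    print(f"clocksource: {current_clocksource()}")
    print(f"{cpu} handled {share:.0%} of counted interrupts")
```

A share close to 100% on one CPU in domain-0 would match the situation described above; how to spread the interrupts is covered in the "Interrupt handling in Xen" message and is not shown here.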