
[Xen-users] Questions about pass through block devices



What happens to the block layer if the block request queue gets very, very long?

Imagine there are many DomUs and they are all working against VERY SLOW disks.

I wish I understood the entire chain better, and I'm sorry to ask such a vague question.

I realize this is an odd question, since the performance implications of this scenario are horrific. I ask because I have a gut feeling that those of us who have experienced Dom0 crashes under heavy disk I/O on Xen 3.0.2-3.0.4 (which includes me!) may be hitting a corner case that most users would never encounter.

In our case, we have Coraid AoE SAN storage that is accessed via drivers in the Dom0s of our cluster, and CLVM LVs are passed through into the DomUs.
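
For concreteness, this is roughly what the pass-through looks like in an xm domU config file; the volume group and LV names here are made-up examples, not our actual layout:

    # Fragment of an xm domU config (Xen 3.0.x syntax). The VG/LV
    # names below are hypothetical illustrations only.
    disk = [ 'phy:/dev/vg_san/domu1-root,sda1,w',
             'phy:/dev/vg_san/domu1-gfs,sdb1,w' ]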

Early on, we made a terrible disk-layout decision that left us with very poor disk I/O performance. That poor disk I/O performance was exacerbated by the fact that nearly every DomU has two LVs attached, and one is GFS, which adds another layer of performance degradation due to DLM locking.

We used to crash a lot. Now that we've re-organized the disks, we're crashing far less often. The change was made to improve performance, and we never in a million years imagined it might be related to this crashing behavior.

It suddenly occurred to me that perhaps we're crashing less often because the block request delivery times have improved dramatically, and the queues are likely growing shorter.
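
If anyone wants to check whether queue depth actually tracks the crashes, here is a minimal sketch (Python, run in Dom0) that samples the in-flight request count per device. It assumes the standard 2.6 /proc/diskstats layout, where the ninth stat field after the device name is the number of I/Os currently in progress:

    import time

    # Periodically report devices with outstanding block requests.
    # Assumes the standard 2.6 /proc/diskstats column layout.
    while True:
        with open('/proc/diskstats') as f:
            for line in f:
                fields = line.split()
                if len(fields) < 14:
                    # Partition lines in older kernels carry fewer
                    # fields; skip them.
                    continue
                name, inflight = fields[2], int(fields[11])
                if inflight:
                    print('%-12s in-flight=%d' % (name, inflight))
        time.sleep(5)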

The only direct evidence that I and one other list member have seen that could be related to this issue is that reading /proc/slabinfo on a regular basis in the Dom0 appears to greatly increase the frequency of crashes. That leads me to believe that the issue, at its core, is somehow related to SLAB corruption in Dom0.
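
For anyone trying to reproduce this, the "regular access" I mean is nothing more exotic than a periodic read. A sketch of the sort of poll involved follows; the cache names are illustrative rather than a definitive list, and given the above, run it in Dom0 at your own risk:

    import time

    # Watch a few slab caches tied to the block layer. In slabinfo,
    # the columns after the cache name are: active_objs num_objs
    # objsize ... The cache names below are examples only.
    WATCH = ('bio', 'biovec-256', 'buffer_head')

    while True:
        with open('/proc/slabinfo') as f:
            for line in f:
                parts = line.split()
                if parts and parts[0] in WATCH:
                    print('%s active=%s total=%s'
                          % (parts[0], parts[1], parts[2]))
        time.sleep(60)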

Another list member has also recently reported that this issue appears to be fixed in the unstable line. Have changes been made in unstable that would shed some light on any of this?

Thanks, and sorry for the novelette.

--
-- Tom Mornini, CTO
-- Engine Yard, Ruby on Rails Hosting
-- Reliability, Ease of Use, Scalability
-- (866) 518-YARD (9273)


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

