[Xen-users] Block device caching causes out of memory in dom0
We run our dom0s with 128MB of RAM using xen 2.0.6, kernel 2.6.11 for dom0 and domUs, and typically have ~10 domUs per server. All the domU filesystems are loopback files served from dom0 using the file: method.

We've found that really busy servers sometimes get out of memory problems on dom0. This appears as lots of order-x page allocation failures (for varying x) with a variable backtrace in some random bit of kernel code (unfortunately I've lost the backtraces :-( ). Looking at the memory statistics, this seems to be caused entirely by caching of the loop devices.

I ran a little test to prove to myself that this was the case, on a dom0 which was doing nothing (load average: 0.01, 0.00, 0.00) apart from serving vbds to domUs:

    cat /proc/meminfo > 1
    perl -e '$a="a"x(5*1024*1024)'    # note this allocates 10MB of memory!
                                      # (perl briefly holds two copies of the string)
    cat /proc/meminfo > 2
    perl -e '$a="a"x(10*1024*1024)'
    cat /proc/meminfo > 3
    perl -e '$a="a"x(15*1024*1024)'
    cat /proc/meminfo > 4
    sleep 1m; cat /proc/meminfo > 5
    sleep 1m; cat /proc/meminfo > 6
    grep ^Cached: 1 2 3 4 5 6

With the following result:

    1:Cached:        92232 kB   <- resting state
    2:Cached:        82828 kB   <- after 10MB memory allocation
    3:Cached:        72976 kB   <- after 20MB memory allocation
    4:Cached:        62952 kB   <- after 30MB memory allocation
    5:Cached:        70420 kB   <- after 60s wait
    6:Cached:        80696 kB   <- after 60s wait

Cached: hovers at about 92MB in the resting state, and the perl processes take memory directly out of it, as expected. However, Cached: then grows back at about 9MB per minute. According to vmstat this server is averaging about 150 kbyte/s of disk IO, which is 9MB per minute - spot on the rate the cache is growing. The server is doing nothing apart from serving the vbds, so the loop files must be what's being cached.

It seems to me that dom0 shouldn't need to cache the loop devices at all, because the domUs have their own buffer caches. Is there any way of stopping xen caching the loop devices?

I thought about mapping a /dev/raw/rawX device onto /dev/loopX, which works (commands in the P.S. below), but the /dev/raw/rawX devices are character devices, not block devices, so I'm not sure how Xen would take to that.

Can Xen use the new O_DIRECT feature directly when it opens the block device? I imagine it's doing the IO from kernel space though, so things may be different there. (There's a userspace sketch of the O_DIRECT idea in the P.S. too.)

Would this be less of a problem if we switched over to using partitions (lvm, say - the P.S. also sketches what that change would involve)? We are slightly reluctant to do this due to previous bad experiences with lvm though!

Or is there another cause of the problem?

Thanks

-- 
Nick Craig-Wood <nick@xxxxxxxxxxxxxx> -- http://www.craig-wood.com/nick
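P.S. For reference, the raw mapping mentioned above was done with something like this (raw comes with util-linux; raw1 and loop0 are just example device numbers):

    # bind the first raw character device to the first loop device
    raw /dev/raw/raw1 /dev/loop0
    # query the binding to check it took
    raw -q /dev/raw/raw1

The binding itself works; the open question is whether blkback will accept a character device as a vbd backend.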
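As a rough userspace approximation of the O_DIRECT idea, something like the following should read a backing file without going through dom0's page cache. This is an untested sketch: iflag=direct needs a reasonably recent GNU dd, and /vserver/domU1.img is a made-up path standing in for one of the loopback files:

    # note Cached: before the read
    grep ^Cached: /proc/meminfo
    # read 10MB of the backing file with O_DIRECT, bypassing the page cache
    dd if=/vserver/domU1.img of=/dev/null bs=1M count=10 iflag=direct
    # Cached: afterwards - with iflag=direct it should be roughly unchanged
    grep ^Cached: /proc/meminfo

Run without iflag=direct, the same read should grow Cached: by about 10MB (assuming those blocks weren't already cached), which would confirm it's the loop files filling the cache.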
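And for completeness, the lvm route would look something like this - again only a sketch, with made-up names and sizes (volume group vg0, domain domU1, 4G volume):

    # create a logical volume to replace the loopback file
    lvcreate -L 4G -n domU1 vg0
    # copy the existing filesystem image across (the LV must be at least as big)
    dd if=/vserver/domU1.img of=/dev/vg0/domU1 bs=1M

and in the domain's config file the disk line changes from file: to phy:, e.g.

    disk = [ 'phy:vg0/domU1,sda1,w' ]

My understanding is that with phy: the block backend does IO against the device directly rather than through the loop driver, so dom0's page cache shouldn't get involved - but I'd be glad to hear from anyone actually running it this way.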