Re: [Xen-users] disk I/O problems under load? (Xen-3.4.1/x86_64)
On Thu, Oct 01, 2009 at 01:33:57AM +0200, Luca Lesinigo wrote:
> I'm getting problems whenever the load on the system increases, but IMHO
> it should be well within hardware capabilities.
> 
> My configuration:
> - HP ProLiant DL160 G5, with a single quad-core E5405, 14GiB RAM, 2x 1TB
> SATA disks (Hitachi 7K1000.B) on the onboard SATA controller (Intel
> chipset)
> - Xen-3.4.1 64bit hypervisor, compiled from gentoo portage, with
> default command-line settings (I just specify the serial console and
> nothing else)
> - Domain-0 with gentoo's xen-sources 2.6.21 (the Xen 2.6.18 tarball
> didn't have networking; I think the HP Tigon3 gigabit driver is too
> old, but I haven't had time to look into that)
> - Domain-0 is using the CFQ I/O scheduler and runs from a software
> RAID-1, no tickless kernel, HZ=100. It has all the free RAM (currently
> some 5.x GiB)
> - the rest of the disk space is also mirrored in a RAID-1 device, and I use
> LVM2 on top of that
> - 6x paravirt 64bit DomUs with a 2.6.29-gentoo-r5 kernel, with the NOOP I/O
> scheduler, tickless kernel, 1 - 1.5GiB of RAM each
> - 1x HVM 32bit Windows XP DomU, without any paravirt drivers, 512MiB RAM
> - I use logical volumes as storage space for the DomUs; the Linux ones
> also have 0.5GiB of swap space (unused, no DomU is swapping)
> - all the Linux DomUs are on ext3 (noatime), and all DomUs are
> single-cpu (just one vcpu each)
> - the network is bridged (one LAN and one WAN interface on the physical
> system and the same for each domU), no jumbo frames
> 
> Usually load on the system is very low. But when there is some I/O
> related load (I can easily trigger it by rsync'ing lots of stuff
> between domUs, or from a different system to one of the domUs or to the
> dom0), load gets very high and I often see domUs spending all their
> cpu time in the "wait" [for I/O] state. When that happens, load on
> Domain-0 gets high (jumps from <1 to >5) and load on the DomUs gets high
> too, probably because of processes waiting for I/O to happen. Sometimes
> iostat will even show exactly 0.00 tps on all the dm-X devices (domU
> storage backends) and some activity on the physical devices, as if all
> domU I/O activity froze up while dom0 is busy flushing caches or doing
> something else.
> 
> vmstat in Dom0 shows one or two cores (25% or 50% cpu time) busy in
> 'iowait' state, and context switches go into the thousands, but not into
> the hundreds of thousands that http://wiki.xensource.com/xenwiki/KnownIssues
> talks about.
> 

You have only 2x 7200 rpm disks for 7 virtual machines and you're
wondering why there's a lot of iowait? :)

> I tried pinning cpus: Domain-0 had its four VCPUs pinned to CPUs 0 and
> 1, some domUs pinned to CPU 2, and some domUs pinned to CPU 3. As
> far as I can tell it did not make any difference.
> I also (briefly) tested with all the Linux DomUs running the CFQ
> scheduler; while it didn't seem to make any difference, it also was too
> short a test to trust it much.
> 
> What's worse, sometimes I get qemu-dm processes (for the HVM domU) in
> zombie state. It also happened that the HVM domU crashed and I wasn't
> able to restart it: I got the "hotplug scripts not working" error from
> xm create, and looking at xenstore-ls I saw instances of the crashed
> domU with all its resources (which probably was the cause of the
> error?). I had to reboot the whole system to be able to start that
> domain again.
> 
> Normally iostat in Domain-0 shows more or less high tps (200~300 under
> normal load, even higher if I play around with rsync to artificially
> trigger the problems) on the md device where all the DomUs reside, and
> much less (usually just 10-20% of the previous value) on the two
> physical disks sda and sdb that compose the mirror. I guess I see less
> tps because the scheduler/elevator in Dom-0 is doing its job.
> 
> I don't know if the load problems and the HVM problem are linked or
> not, but I also don't know where to look to solve either of them.
> 
> Any help would be appreciated, thank you very much. Also, what are the
> ideal/recommended settings in dom0 and domU regarding I/O schedulers
> and tickless or not?
> Is there any reason to leave the hypervisor some extra free RAM, or is it
> ok to just let xend shrink dom0 when needed and leave free just the
> minimum? If I sum up the memory (currently) used by the domains, I get
> 14146MiB. xm info says 14335MiB total_memory and 10MiB free_memory.
> 

A single 7200 rpm SATA disk can do around 120 random IOPS, i.e. 120 IO
operations per second. 120 IOPS / 7 VMs = ~17 IOPS available per VM.
That's not much.

-- Pasi
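A rough back-of-envelope for this particular mirror (a sketch only, using the
~120 random IOPS per 7200 rpm disk figure quoted above; real numbers depend on
seek pattern, queue depth and the caches involved):

    random reads  (RAID-1 can serve from either disk): ~2 x 120 = ~240 IOPS total
    random writes (RAID-1 must commit to both disks):  ~1 x 120 = ~120 IOPS total
    write-heavy load shared by 7 guests:               ~120 / 7  = ~17  IOPS per guest

An rsync between two domUs puts reads and writes on the same two spindles at
once, so during such a run the effective per-guest budget is lower still.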
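For reference, the LV-backed setup described above normally means each guest
config exports a logical volume through the 'phy:' block backend, roughly like
this (a sketch only; the volume group and LV names, vg0 and domu1-*, are made
up for illustration):

    # /etc/xen/domu1.cfg -- disk section only
    disk = [ 'phy:/dev/vg0/domu1-root,xvda1,w',
             'phy:/dev/vg0/domu1-swap,xvda2,w' ]

With phy: backends all guest I/O ends up as requests against the same md
mirror in dom0, which is consistent with seeing the aggregate tps on the md/dm
devices and the merged, lower tps on sda/sdb.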
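On the free-memory question, a commonly suggested arrangement (a sketch, not
verified on this particular box; the 2048M figure is only an example) is to
give dom0 a fixed amount of memory at boot and stop xend from ballooning it,
and to keep the vcpu pinning explicit so dom0 and the guests do not compete
for the same cores:

    # grub: Xen command line (serial console options as an example)
    kernel /boot/xen.gz dom0_mem=2048M com1=115200,8n1 console=com1

    # /etc/xen/xend-config.sxp
    (dom0-min-mem 2048)
    (enable-dom0-ballooning no)   # if your xend version has this option

    # pin dom0's four vcpus to cores 0-1, leave 2-3 for the guests
    xm vcpu-pin Domain-0 0 0-1
    xm vcpu-pin Domain-0 1 0-1
    xm vcpu-pin Domain-0 2 0-1
    xm vcpu-pin Domain-0 3 0-1

A fixed dom0_mem avoids dom0 shrinking (and dropping its page cache) under
memory pressure from the guests; whether 2048M is enough depends on how much
buffering dom0 itself needs to do.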