[Xen-devel] Re: poor domU VBD performance.
I am sorry to return to this issue after quite a long interruption. As I mentioned in an earlier post, I came across this problem when I was testing file-system performance. After the problems with raw sequential I/O seemed to have been fixed in the testing release, I turned back to my original problem.

I did a simple test that, despite its simplicity, seems to put the I/O subsystem under considerable stress. I took the /usr tree of my system and copied it five times into different directories on a slice of disk 1. This tree consists of 36000 files with about 750 MB of data. Then I started to copy each of these copies recursively onto disk 2 (each to its own location on that disk, of course). I ran these copies in parallel; the processes took about 6 to 7 minutes in DOM0, while they needed between 14.6 and 15.9 minutes in DOMU. Essentially this means that under this heavy I/O load I get back to the roughly 40% ratio between I/O performance on DOMU and I/O performance on DOM0 that I initially reported. This may just be coincidence, but it is probably worth mentioning. (A rough sketch of this workload is included after the iostat snapshots below.)

I monitored the disk and block-I/O activity with iostat. The output is too large to post here in full, so I will only include a few representative snapshots. The first two show the activity while copying in DOMU.

This is a snapshot of a phase with relatively high throughput (DOMU):

Device:   rrqm/s   wrqm/s     r/s    w/s    rsec/s    wsec/s     rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
hde         0.00  2748.00    1.60  71.20     12.80  22561.60      6.40  11280.80    310.09      1.78   23.96   4.73  34.40
hdg      2571.00     5.00  126.80   9.60  21580.80    115.20  10790.40     57.60    159.06      5.48   40.38   6.61  90.20

avg-cpu:  %user   %nice %system %iowait   %idle
           0.20    0.00    6.20    0.20   93.40

This is a snapshot of a phase with relatively low throughput (DOMU):

Device:   rrqm/s   wrqm/s     r/s    w/s    rsec/s    wsec/s     rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
hde         0.00   676.40    0.00  33.00      0.00   5678.40      0.00   2839.20    172.07      1.76   53.45   4.91  16.20
hdg       335.80    11.00  315.00   3.40   5206.40    115.20   2603.20     57.60     16.71      4.15   13.02   2.76  87.80

avg-cpu:  %user   %nice %system %iowait   %idle
           0.20    0.00    9.00    0.00   90.80

I suspect that the reported iowait in the CPU statistics is not entirely correct, but I am not sure about it.

The next two snapshots show iostat output during the copying in DOM0. Again, the first was taken in a phase of relatively high throughput (DOM0):

Device:   rrqm/s   wrqm/s     r/s    w/s    rsec/s    wsec/s     rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
hde         0.00  5845.40    1.40 110.20     11.20  47812.80      5.60  23906.40    428.53    105.96  772.63   8.96 100.00
hdg        46.20    24.80  389.80   2.20  47628.80    216.00  23814.40    108.00    122.05      7.12   18.23   3.30 129.40

avg-cpu:  %user   %nice %system %iowait   %idle
           2.40    0.00   40.20   57.40    0.00

The next snapshot was taken in a phase of relatively low throughput (DOM0):

Device:   rrqm/s   wrqm/s     r/s    w/s    rsec/s    wsec/s     rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
hde         0.00   903.40    0.20 106.80      3.20   7972.80      1.60   3986.40     74.54     20.77  217.91   4.06  43.40
hdg         0.00    24.00  746.60   1.20   9302.40    200.00   4651.20    100.00     12.71      4.96    6.67   1.34 100.00

avg-cpu:  %user   %nice %system %iowait   %idle
           3.40    0.00   44.00   52.60    0.00

The problem seems to be the reading. The device hde, which contains the slice the data is copied onto, is almost never really busy when copying from DOMU. The ratio of kB/s written to utilization seems to show that writing from DOMU is just as efficient as writing from DOM0 (writing can be buffered in both cases, after all). Yet the information on reading shows a different picture.
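For reference, here is a minimal sketch of the copy workload described above. It is in Python purely for illustration; the mount points and directory names are assumptions, and the real test simply ran recursive cp commands in parallel from the shell while iostat collected extended statistics (iostat -x with an interval of a few seconds) alongside.

#!/usr/bin/env python3
# Sketch of the parallel copy test; the paths below are assumptions, not the real ones.
import os
import subprocess
import time

SRC_SLICE = "/mnt/disk1"   # slice on disk 1 holding the five copies of /usr (assumed path)
DST_SLICE = "/mnt/disk2"   # target slice on disk 2 (assumed path)
NCOPIES = 5                # five copies of the ~36000-file, ~750 MB /usr tree

procs = []
start = time.time()
for i in range(NCOPIES):
    src = os.path.join(SRC_SLICE, "usr.%d" % i)
    dst = os.path.join(DST_SLICE, "usr.%d" % i)
    # start one recursive copy per tree; all five copies run in parallel
    procs.append(subprocess.Popen(["cp", "-r", src, dst]))

for p in procs:
    p.wait()

minutes = (time.time() - start) / 60.0
print("%d parallel recursive copies finished in %.1f minutes" % (NCOPIES, minutes))

On the hardware described below this run takes about 6 to 7 minutes in DOM0 and 14.6 to 15.9 minutes in DOMU.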
The block I/O layer merges requests constantly, resulting in request sizes that are approximately equal in both cases. Yet the service times for DOMU requests are about twice those needed for DOM0 requests.

I do not know whether such a scenario is simply inadequate for virtual systems, at least under Xen. We are thinking about running a mail gateway on top of a protected and secured DOM0 system, and potentially offering other network services in separate domains. We want to avoid corruption of DOM0 while being able to offer "insecure" services in nonprivileged domains. We know that mail servicing can put an intense load onto the filesystem, admittedly more on inodes (create and delete) than on data throughput. Do I simply have to accept that, under heavy I/O load, domains using VBDs to access storage devices will lag behind DOM0 and native Linux systems, or is there a chance to fix this?

The reported test was done on a Fujitsu-Siemens RX100 system with a 2.0 GHz Celeron CPU and a total of only 256 MB of memory; DOM0 had 128 MB and DOMU 100 MB. The disks were plain IDE disks. I did the same test on a system with 1.25 GB of RAM, with both domains having 0.5 GB of memory. That system has SATA disks, and the results are essentially the same; the only difference is that both runs are slower due to lower throughput under random access from those disks.

Any advice or help?

Thanks in advance,
Peter

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel