
Re: [Xen-devel] IO speed limited by size of IO request (for RBD driver)



On Thu, May 23, 2013 at 07:22:27AM +0000, Felipe Franciosi wrote:
> 
> 
> On 22 May 2013, at 21:13, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx> 
> wrote:
> 
> > On Wed, May 08, 2013 at 11:14:26AM +0000, Felipe Franciosi wrote:
> >> However we didn't "prove" it properly, I think it is worth mentioning that 
> >> this boils down to what we originally thought it was:
> >> Steven's environment is writing to a filesystem in the guest. On top of 
> >> that, it's using the guest's buffer cache to do the writes.
> > 
> > If he is using O_DIRECT it bypasses the cache in the guest.
> 
> Certainly, but the issues were when _not_ using O_DIRECT.


I am confused. Does feature-indirect-descriptor make it worse or better
when !O_DIRECT?

Or is there no difference when using !O_DIRECT with
feature-indirect-descriptor?

> 
> F
> 
> 
> > 
> >> This means that we cannot (easily?) control how the cache and the fs are 
> >> flushing these writes through blkfront/blkback.

echo 3 > /proc/..something/drop_cache

does it?
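That is, something along these lines - assuming the knob I am thinking of is
the usual /proc/sys/vm/drop_caches (needs root, and it only frees *clean*
cache, hence the sync first):

```shell
# Flush dirty pages, then drop the (clean) page cache plus
# dentries/inodes so the next run starts with a cold cache.
sync
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches
else
    echo "need root to write /proc/sys/vm/drop_caches"
fi
```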
> >> 
> >> In other words, it's very likely that it generates a workload that simply 
> >> doesn't perform well on the "stock" PV protocol.

'fio' is an excellent tool to run the tests without using the cache.
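For example, a minimal fio job along these lines (the filename and size are
placeholders) exercises the same sequential-write path as the dd runs below,
with O_DIRECT keeping the guest's cache out of the picture:

```ini
; sketch of a fio job - adjust filename/size to taste
[seqwrite]
rw=write
bs=1M
size=2g
direct=1
ioengine=libaio
iodepth=8
filename=/mnt/test/output.fio
```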

> >> This is a good example of how indirect descriptors help (remembering Roger 
> >> and I were struggling to find use cases where indirect descriptors showed 
> >> a substantial gain).


You mean using O_DIRECT? Yes, all tests that involve any I/O should use
O_DIRECT.
Otherwise they are misleading. And my understanding from this thread is that
Steven did that and found:
 a) without feature-indirect-descriptor - the I/O was sucky
 b) with the initial feature-indirect-descriptor - the I/O was less sucky
 c) with feature-indirect-descriptor and a tweak to the frontend for how many
    segments to use - the I/O was the same as on baremetal.

Sorry about being so verbose here - I feel that I am missing something and
I am not exactly sure what it is. Could you please enlighten me?
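To put rough numbers on the segment tweak in (c) - assuming I have the
constants right (4 KiB segments, 11 segments per request on the classic
blkif protocol, xen_blkif_max_segments segments with indirect descriptors):

```python
# Back-of-the-envelope maximum request size for different segment limits.
PAGE = 4096  # bytes per data segment (one granted page)

def max_request_bytes(segments):
    # each segment maps one page of guest data
    return segments * PAGE

for label, segs in [("classic blkif", 11),
                    ("indirect, frontend default", 32),
                    ("indirect, 64 segments", 64),
                    ("indirect, 128 segments", 128)]:
    print("%-28s %3d segs -> %4d KiB/request"
          % (label, segs, max_request_bytes(segs) // 1024))
```

So a 1 MiB dd block gets chopped into ~24 requests on the classic protocol,
but only 2 with 128 segments - which would explain why (c) matches baremetal.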

> >> 
> >> Cheers,
> >> Felipe
> >> 
> >> -----Original Message-----
> >> From: Roger Pau Monne 
> >> Sent: 08 May 2013 11:45
> >> To: Steven Haigh
> >> Cc: Felipe Franciosi; xen-devel@xxxxxxxxxxxxx
> >> Subject: Re: IO speed limited by size of IO request (for RBD driver)
> >> 
> >> On 08/05/13 12:32, Steven Haigh wrote:
> >>> On 8/05/2013 6:33 PM, Roger Pau Monné wrote:
> >>>> On 08/05/13 10:20, Steven Haigh wrote:
> >>>>> On 30/04/2013 8:07 PM, Felipe Franciosi wrote:
> >>>>>> I noticed you copied your results from "dd", but I didn't see any 
> >>>>>> conclusions drawn from experiment.
> >>>>>> 
> >>>>>> Did I understand it wrong or now you have comparable performance on 
> >>>>>> dom0 and domU when using DIRECT?
> >>>>>> 
> >>>>>> domU:
> >>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>>>>> 2048+0 records in
> >>>>>> 2048+0 records out
> >>>>>> 2147483648 bytes (2.1 GB) copied, 25.4705 s, 84.3 MB/s
> >>>>>> 
> >>>>>> dom0:
> >>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>>>>> 2048+0 records in
> >>>>>> 2048+0 records out
> >>>>>> 2147483648 bytes (2.1 GB) copied, 24.8914 s, 86.3 MB/s
> >>>>>> 
> >>>>>> 
> >>>>>> I think that if the performance differs when NOT using DIRECT, the 
> >>>>>> issue must be related to the way your guest is flushing the cache. 
> >>>>>> This must be generating a workload that doesn't perform well on Xen's 
> >>>>>> PV protocol.
> >>>>> 
> >>>>> Just wondering if there is any further input on this... While DIRECT 
> >>>>> writes are as good as can be expected, NON-DIRECT writes in certain 
> >>>>> cases (specifically with a mdadm raid in the Dom0) are affected by 
> >>>>> about a 50% loss in throughput...
> >>>>> 
> >>>>> The hard part is that this is the default mode of writing!
> >>>> 
> >>>> As another test with indirect descriptors, could you change 
> >>>> xen_blkif_max_segments in xen-blkfront.c to 128 (it is 32 by 
> >>>> default), recompile the DomU kernel and see if that helps?
> >>> 
> >>> Ok, here we go.... compiled as 3.8.0-2 with the above change. 3.8.0-2 
> >>> is running on both the Dom0 and DomU.
> >>> 
> >>> # dd if=/dev/zero of=output.zero bs=1M count=2048
> >>> 2048+0 records in
> >>> 2048+0 records out
> >>> 2147483648 bytes (2.1 GB) copied, 22.1703 s, 96.9 MB/s
> >>> 
> >>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >>>            0.34    0.00   17.10    0.00    0.23   82.33
> >>> 
> >>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
> >>> avgrq-sz avgqu-sz   await  svctm  %util
> >>> sdd             980.97 11936.47   53.11  429.78     4.00    48.77 
> >>> 223.81    12.75   26.10   2.11 101.79
> >>> sdc             872.71 11957.87   45.98  435.67     3.55    49.30 
> >>> 224.71    13.77   28.43   2.11 101.49
> >>> sde             949.26 11981.88   51.30  429.33     3.91    48.90 
> >>> 225.03    21.29   43.91   2.27 109.08
> >>> sdf             915.52 11968.52   48.58  428.88     3.73    48.92 
> >>> 225.84    21.44   44.68   2.27 108.56
> >>> md2               0.00     0.00    0.00 1155.61     0.00    97.51 
> >>> 172.80     0.00    0.00   0.00   0.00
> >>> 
> >>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>> 2048+0 records in
> >>> 2048+0 records out
> >>> 2147483648 bytes (2.1 GB) copied, 25.3708 s, 84.6 MB/s
> >>> 
> >>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >>>            0.11    0.00   13.92    0.00    0.22   85.75
> >>> 
> >>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s 
> >>> avgrq-sz avgqu-sz   await  svctm  %util
> >>> sdd               0.00 13986.08    0.00  263.20     0.00    55.76 
> >>> 433.87     0.43    1.63   1.07  28.27
> >>> sdc             202.10 13741.55    6.52  256.57     0.81    54.77 
> >>> 432.65     0.50    1.88   1.25  32.78
> >>> sde              47.96 11437.57    1.55  261.77     0.19    45.79 
> >>> 357.63     0.80    3.02   1.85  48.60
> >>> sdf            2233.37 11756.13   71.93  191.38     8.99    46.80 
> >>> 433.90     1.49    5.66   3.27  86.15
> >>> md2               0.00     0.00    0.00  731.93     0.00    91.49 
> >>> 256.00     0.00    0.00   0.00   0.00
> >>> 
> >>> Now this is pretty much exactly what I would expect the system to do.... 
> >>> ~96MB/sec buffered, and 85MB/sec direct.
> >> 
> >> I'm sorry to be such a PITA, but could you also try with 64? If we have to 
> >> increase the maximum number of indirect descriptors I would like to set it 
> >> to the lowest value that provides good performance to prevent using too 
> >> much memory.
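To put rough numbers on that memory cost - the ring size and segment size
below are my assumptions (one-page blkif ring, 32 in-flight requests, 4 KiB
per data segment), not measured values:

```python
# Worst-case granted memory per vbd: every in-flight request
# using its full complement of segments.
PAGE = 4096
RING_REQUESTS = 32  # assumption: one-page blkif ring

def pinned_mib(max_segments):
    return RING_REQUESTS * max_segments * PAGE // (1 << 20)

for segs in (32, 64, 128):
    print("max_segments=%3d -> up to %2d MiB granted" % (segs, pinned_mib(segs)))
```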
> >> 
> >>> So - it turns out that xen_blkif_max_segments at 32 is a killer in the 
> >>> DomU. Now it makes me wonder what we can do about this in kernels that 
> >>> don't have your series of patches against it? And also about the 
> >>> backend stuff in 3.8.x etc?
> >> 
> >> There isn't much we can do regarding kernels without indirect descriptors, 
> >> there's no easy way to increase the number of segments in a request.
> >> 
> >> 
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@xxxxxxxxxxxxx
> >> http://lists.xen.org/xen-devel
> >> 
> 
