Re: [Xen-devel] IO speed limited by size of IO request (for RBD driver)
On Thu, May 23, 2013 at 07:22:27AM +0000, Felipe Franciosi wrote:
>
>
> On 22 May 2013, at 21:13, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx>
> wrote:
>
> > On Wed, May 08, 2013 at 11:14:26AM +0000, Felipe Franciosi wrote:
> >> Even though we didn't "prove" it properly, I think it is worth mentioning that
> >> this boils down to what we originally thought it was:
> >> Steven's environment is writing to a filesystem in the guest. On top of
> >> that, it's using the guest's buffer cache to do the writes.
> >
> > If he is using O_DIRECT it bypasses the cache in the guest.
>
> Certainly, but the issues were when _not_ using O_DIRECT.
I am confused. Does the feature-indirect-descriptor make it worse or better
when !O_DIRECT?
Or is there no difference when using !O_DIRECT with the
feature-indirect-descriptor?
>
> F
>
>
> >
> >> This means that we cannot (easily?) control how the cache and the fs are
> >> flushing these writes through blkfront/blkback.
echo 3 > /proc/..something/drop_cache
does it?
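(For reference, the standard knob is /proc/sys/vm/drop_caches; a typical
sequence to flush the guest page cache between benchmark runs would be:)

    sync                                # flush dirty pages to disk first
    echo 3 > /proc/sys/vm/drop_caches   # then drop page cache, dentries and inodes

Note that this only empties the cache; it does not stop subsequent writes from
being buffered again.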
> >>
> >> In other words, it's very likely that it generates a workload that simply
> >> doesn't perform well on the "stock" PV protocol.
'fio' is an excellent tool to run the tests without using the cache.
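(As an illustration only, a sequential-write fio job that bypasses the page
cache, roughly equivalent to the dd oflag=direct runs quoted below - the file
name and sizes are just placeholders:)

    # direct=1 opens the file with O_DIRECT, so the guest page cache is not involved
    fio --name=seqwrite --filename=output.zero --rw=write --bs=1M --size=2G \
        --direct=1 --ioengine=libaio --iodepth=4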
> >> This is a good example of how indirect descriptors help (remembering Roger
> >> and I were struggling to find use cases where indirect descriptors showed
> >> a substantial gain).
You mean using O_DIRECT? Yes, all tests that involve any I/O should use
O_DIRECT. Otherwise they are misleading. And my understanding from this thread
is that Steven did that and found that:
a) without the feature-indirect-descriptor - the I/O was sucky
b) with the initial feature-indirect-descriptor - the I/O was less sucky
c) with the feature-indirect-descriptor and a tweak to the frontend for how many
segments to use - the I/O was the same as on bare metal.
Sorry about being so verbose here - I feel that I am missing something and
I am not exactly sure what it is. Could you please enlighten me?
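(As an aside, one way to check whether the backend advertises the feature at
all is to look at the vbd nodes in xenstore from dom0; the exact key name
comes from the indirect-descriptor patches and may differ, hence the loose
grep:)

    # list the backend vbd entries for all guests and look for the indirect-segments key
    xenstore-ls -f /local/domain/0/backend/vbd | grep -i indirect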
> >>
> >> Cheers,
> >> Felipe
> >>
> >> -----Original Message-----
> >> From: Roger Pau Monne
> >> Sent: 08 May 2013 11:45
> >> To: Steven Haigh
> >> Cc: Felipe Franciosi; xen-devel@xxxxxxxxxxxxx
> >> Subject: Re: IO speed limited by size of IO request (for RBD driver)
> >>
> >> On 08/05/13 12:32, Steven Haigh wrote:
> >>> On 8/05/2013 6:33 PM, Roger Pau Monné wrote:
> >>>> On 08/05/13 10:20, Steven Haigh wrote:
> >>>>> On 30/04/2013 8:07 PM, Felipe Franciosi wrote:
> >>>>>> I noticed you copied your results from "dd", but I didn't see any
> >>>>>> conclusions drawn from the experiment.
> >>>>>>
> >>>>>> Did I understand it wrong, or do you now have comparable performance on
> >>>>>> dom0 and domU when using DIRECT?
> >>>>>>
> >>>>>> domU:
> >>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>>>>> 2048+0 records in
> >>>>>> 2048+0 records out
> >>>>>> 2147483648 bytes (2.1 GB) copied, 25.4705 s, 84.3 MB/s
> >>>>>>
> >>>>>> dom0:
> >>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>>>>> 2048+0 records in
> >>>>>> 2048+0 records out
> >>>>>> 2147483648 bytes (2.1 GB) copied, 24.8914 s, 86.3 MB/s
> >>>>>>
> >>>>>>
> >>>>>> I think that if the performance differs when NOT using DIRECT, the
> >>>>>> issue must be related to the way your guest is flushing the cache.
> >>>>>> This must be generating a workload that doesn't perform well on Xen's
> >>>>>> PV protocol.
> >>>>>
> >>>>> Just wondering if there is any further input on this... While DIRECT
> >>>>> writes are as good as can be expected, NON-DIRECT writes in certain
> >>>>> cases (specifically with an mdadm RAID in the Dom0) are affected by
> >>>>> about a 50% loss in throughput...
> >>>>>
> >>>>> The hard part is that this is the default mode of writing!
> >>>>
> >>>> As another test with indirect descriptors, could you change
> >>>> xen_blkif_max_segments in xen-blkfront.c to 128 (it is 32 by
> >>>> default), recompile the DomU kernel and see if that helps?
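(For anyone following along, the suggested change is just bumping a constant in
the frontend driver; a rough sketch of the workflow, assuming a 3.8 tree with
the indirect-descriptor series applied:)

    # locate the constant in the frontend driver
    grep -n xen_blkif_max_segments drivers/block/xen-blkfront.c
    # change its initializer from 32 to 128, i.e. something along the lines of:
    #   static unsigned int xen_blkif_max_segments = 128;
    # then rebuild and install the DomU kernel
    make -j"$(nproc)" && make modules_install install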
> >>>
> >>> Ok, here we go.... compiled as 3.8.0-2 with the above change. 3.8.0-2
> >>> is running on both the Dom0 and DomU.
> >>>
> >>> # dd if=/dev/zero of=output.zero bs=1M count=2048
> >>> 2048+0 records in
> >>> 2048+0 records out
> >>> 2147483648 bytes (2.1 GB) copied, 22.1703 s, 96.9 MB/s
> >>>
> >>> avg-cpu: %user %nice %system %iowait %steal %idle
> >>> 0.34 0.00 17.10 0.00 0.23 82.33
> >>>
> >>> Device:  rrqm/s    wrqm/s     r/s     w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
> >>> sdd      980.97   11936.47   53.11   429.78   4.00  48.77    223.81     12.75  26.10   2.11  101.79
> >>> sdc      872.71   11957.87   45.98   435.67   3.55  49.30    224.71     13.77  28.43   2.11  101.49
> >>> sde      949.26   11981.88   51.30   429.33   3.91  48.90    225.03     21.29  43.91   2.27  109.08
> >>> sdf      915.52   11968.52   48.58   428.88   3.73  48.92    225.84     21.44  44.68   2.27  108.56
> >>> md2        0.00       0.00    0.00  1155.61   0.00  97.51    172.80      0.00   0.00   0.00    0.00
> >>>
> >>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
> >>> 2048+0 records in
> >>> 2048+0 records out
> >>> 2147483648 bytes (2.1 GB) copied, 25.3708 s, 84.6 MB/s
> >>>
> >>> avg-cpu: %user %nice %system %iowait %steal %idle
> >>> 0.11 0.00 13.92 0.00 0.22 85.75
> >>>
> >>> Device:  rrqm/s    wrqm/s     r/s     w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
> >>> sdd        0.00   13986.08    0.00   263.20   0.00  55.76    433.87      0.43   1.63   1.07   28.27
> >>> sdc      202.10   13741.55    6.52   256.57   0.81  54.77    432.65      0.50   1.88   1.25   32.78
> >>> sde       47.96   11437.57    1.55   261.77   0.19  45.79    357.63      0.80   3.02   1.85   48.60
> >>> sdf     2233.37   11756.13   71.93   191.38   8.99  46.80    433.90      1.49   5.66   3.27   86.15
> >>> md2        0.00       0.00    0.00   731.93   0.00  91.49    256.00      0.00   0.00   0.00    0.00
> >>>
> >>> Now this is pretty much exactly what I would expect the system to do....
> >>> ~96MB/sec buffered, and 85MB/sec direct.
> >>
> >> I'm sorry to be such a PITA, but could you also try with 64? If we have to
> >> increase the maximum number of indirect descriptors, I would like to set it
> >> to the lowest value that provides good performance to prevent using too
> >> much memory.
> >>
> >>> So - it turns out that xen_blkif_max_segments at 32 is a killer in the
> >>> DomU. Now it makes me wonder what we can do about this in kernels that
> >>> don't have your series of patches against it? And also about the
> >>> backend stuff in 3.8.x etc?
> >>
> >> There isn't much we can do regarding kernels without indirect descriptors;
> >> there's no easy way to increase the number of segments in a request.
> >>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel