
Re: [win-pv-devel] Windows on Xen bad IO performance


  • To: 'Jakub Kulesza' <jakkul@xxxxxxxxx>
  • From: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
  • Date: Fri, 28 Sep 2018 12:00:20 +0000
  • Accept-language: en-GB, en-US
  • Cc: "win-pv-devel@xxxxxxxxxxxxxxxxxxxx" <win-pv-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 28 Sep 2018 12:00:26 +0000
  • List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>
  • Thread-index: AQHUKBddpgP4V3zgh064pZm0uiacGqSo9TUA///zOQCAACyqQIBb1iqAgADJWiCAAA98AIAALHuA
  • Thread-topic: [win-pv-devel] Windows on Xen bad IO performance

> -----Original Message-----
> From: Jakub Kulesza [mailto:jakkul@xxxxxxxxx]
> Sent: 28 September 2018 12:04
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [win-pv-devel] Windows on Xen bad IO performance
> 
> On Fri, 28 Sep 2018 at 10:46, Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> wrote:
> [cut]
> >   Thanks for the very detailed analysis!
> >
> >   Actually 8.2.1 are the latest signed drivers.
> 
> Retesting this again on the same testbed. Results are exactly the same
> as with 8.2.0.
> 
> [cut]
> 
> 
> >   I notice from your QEMU log that you are suffering grant table
> exhaustion. See line 142 onwards. This will *severely* affect the
> performance so I suggest you expand your grant table. You'll still see the
> buffer reaping, but the perf. should be better.
> >
> 
> I have compared gnttab_max_frames 32 and 128. Results:
> 
> == pv drivers 8.2.1, gnttab_max_frames=32 (debian 9 default, same
> testbed as last tests)
> Atto results: https://imgur.com/gallery/ElSwBqM
> responsiveness: a tad better than 8.2.0, and the big packet graph
> shows this. IO saturation and dead IO graphs are still there. It's
> better, and more responsive than 8.2.0 by a margin. Responsiveness
> recovers instantly after Atto is done. Still bad, but better.
> After Atto is done, Xen's VNC has lost its mouse. Keyboard works. Funny.
> XENVBD|__BufferReaperThread:Reaping Buffers is there in the logs
> 
> == pv drivers 8.2.1, gnttab_max_frames=128 (same testbed as last tests)
> Atto results: https://imgur.com/gallery/7x8k2RS
> responsiveness: Up to Atto transfer sizes of 12MB, I cannot say if it's
> different. IO saturation and dead IO graphs are still there. When it
> started testing 16MB read, suddenly everything got unblocked like
> magic. I need to do more testing. This looks unreal.
> After atto is done, mouse did not get lost :)
> 
> XENVBD|__BufferReaperThread:Reaping Buffers (2305 > 32) is there in the
> logs.

At 16MB I suspect things suddenly became aligned and so all the bouncing 
stopped. Thus all the log spam ceased and things got a lot more stable.
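
As an aside, for anyone else on the list wanting to repeat your
gnttab_max_frames test, the sketch below is the usual way to apply it on a
Debian dom0. The file names are the stock Debian ones rather than anything
you have confirmed, so treat it as illustrative; the xl dmesg line you quote
below is the check that the option actually stuck.

  # /etc/default/grub (some installs use /etc/default/grub.d/xen.cfg instead)
  GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=4096M gnttab_max_frames=128"

  # regenerate the bootloader config and reboot the host
  update-grub
  reboot

  # after the reboot, confirm the hypervisor saw the option
  xl dmesg | grep gnttab_max_frames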

> 
> # xl dmesg | grep mem | head -n 1
> (XEN) Command line: placeholder dom0_mem=4096M gnttab_max_frames=128
> 
> I would say that in the case of Atto (which is REALLY IO heavy) the
> impact is very marginal. On the other hand I see that SQL Server
> workloads benefit from changing gnttab_max_frames.
> 
> Side note, what does this actually mean:
> 2679@1538131510.689960:xen_platform_log xen platform:
> XENBUS|GnttabExpand: added references [00003a00 - 00003bff]
> 2679@1538131512.359271:xen_platform_log xen platform:
> XENBUS|RangeSetPop: fail1 (c000009a)
> 

Logically these messages should be read the other way round (I expect there was 
another GnttabExpand after that RangeSetPop).

When a new grant table page is added (by GnttabExpand) a new set of refs (in 
this case from 3a00 to 3bff) becomes available. These are added into the 
XENBUS_RANGE_SET used by the XENBUS_GNTTAB code. When something wants to 
allocate a ref then RangeSetPop is called to get an available ref. When that 
call fails it means the range set is empty and so a new page needs to be added, 
so GnttabExpand is called again to do that.
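
If you want to see that ordering in your trace, something like the grep below
will pull out the relevant lines together with their timestamps (the log file
name is only an example; use whichever file your xen_platform_log output lands
in). The c000009a in the fail1 line is just STATUS_INSUFFICIENT_RESOURCES,
which here simply means "the range set is empty".

  # illustrative only: point this at wherever your QEMU trace output goes
  grep -E 'GnttabExpand|RangeSetPop' /var/log/xen/qemu-trace.log | tail -n 40

Each fail1 should be followed shortly afterwards by a GnttabExpand adding
another 512 refs (one grant table page's worth), as with the 3a00 - 3bff range
you quoted.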

> 
> [cut]
> > > XENVBD|__BufferReaperThread:Reaping Buffers (966 > 32)
> > >
> > > Reaping buffers does not happen with the latest drivers.
> > >
> >
> >   The fact that you are clearly seeing a lot of buffer reaping is interesting
> in itself. The buffer code is there to provide memory for bouncing SRBs when
> the storage stack fails to honour the minimum 512 byte sector alignment
> needed by the blkif protocol. These messages indicate that atto is not
> honouring that alignment.
> 
> Maybe Atto is not, but neither is MS SQL. This is visible when testing with
> Atto on both 8.2.1 and 8.2.0, not visible on 9.0-dev-20180927. The
> 9.0-dev is getting lower results with smaller packet sizes, but stable
> and working across the Atto test.
> 
> >
> > > == questions:
> > >
> > > * so you guys must have done something in the right direction since
> > > 8.2.0. BRAVO.
> >
> >   The master branch has a lot of re-work and the buffering code is one
> of the places that was modified. It now uses a XENBUS_CACHE to acquire
> bounce buffers and these caches do not reap in the same way. The cache
> code uses a slab allocator and this simply frees slabs when all the
> contained objects become unreferenced. The bounce objects are quite small
> and thus, with enough alloc/free interleaving, it's quite likely that the
> cache will remain hot, so little slab freeing or allocation will actually
> happen and the bounce buffer allocation and freeing overhead will be very
> small.
> >   Also the master branch should default to a single (or maybe 2?) page
> ring, even if the backend can do 16, whereas all the 8.2.X drivers will use
> all 16 pages (which is why you need a heap more grant entries).
> >
> 
> Can this be tweaked somehow on current 8.2.X drivers, to get a single
> page ring? max_ring_page_order on xen_blkback in dom0?

Yes, tweaking the mod param in blkback will do the trick.
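
From memory (so do check the parameter name against your dom0 kernel),
something along these lines in dom0 should do it; an order of 0 means a
single ring page and 4 means the full 16 pages:

  # check the current setting (assuming xen-blkback is loaded as a module)
  cat /sys/module/xen_blkback/parameters/max_ring_page_order

  # make it persistent (the .conf file name is arbitrary), then reload the
  # module or reboot dom0 so that newly connected vbds pick it up
  echo 'options xen-blkback max_ring_page_order=0' > /etc/modprobe.d/xen-blkback.conf

  # if blkback is built into your kernel instead, append
  # xen_blkback.max_ring_page_order=0 to the dom0 kernel command line

The point being that the 8.2.X frontend can only use as many ring pages as
the backend advertises.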

> 
> > > * what is the expected write and read speed on hardware that can
> > > deliver (measured with dd) reads at about 77MB/s, and writes 58MB/s.
> > > * do you guys plan to improve something more? How can I help to test
> > > and debug it?
> > > * when are you planning to have a next signed release?
> >
> >   All the real improvements are in master (not even in the as-yet-
> unsigned 8.2.2), so maybe we're nearing the point where a 9.0.0 release
> makes sense. This means we need to start doing full logo kit runs on all
> the drivers to shake out any weird bugs or compatibility problems, which
> takes quite a bit of effort so I'm not sure how soon we'll get to that.
> Hopefully within a few months though.
> >   You could try setting up a logo kit yourself and testing XENVBD to
> see if it passes... that would be useful knowledge.
> 
> Seems fun. Where can I read up on how to set up the logo kit?
> 

See 
https://docs.microsoft.com/en-us/windows-hardware/test/hlk/windows-hardware-lab-kit

> Is there an acceptance test plan that should be run?
> 

I've not used the kit in a while, but I believe it should automatically select 
all the tests relevant to the driver you elect to test (which is XENVBD in this 
case).

> Is there a list of issues that you'll want to get fixed for 9.0? Is
> Citrix interested right now in getting their customers' Windows VMs
> running better :)?

Indeed Citrix should be interested, but testing and updating the branded 
drivers has to be prioritized against other things. Whether Citrix wants to 
update branded drivers does not stop me signing and releasing the Xen Project 
drivers though... it just means they won't get as much testing, so I'd rather 
wait... but only if it doesn't take too long.

> Testing Windows VMs on VMware the same way (with VMware's paravirtual
> IO) is not stellar anyway; it looks crap when you compare it to virtio
> on KVM. And 9.0-dev, I'd say, would be on par with the big competitor.
> 
> Funny story: I've tried getting virtio QEMU devices running within a
> Xen VM, but this is not stable enough. I have managed to get the
> device to show up in Windows, but didn't manage to put a filesystem
> on it under Windows.
> 

A lot of virtio's performance comes from the fact that KVM is a type-2 
hypervisor and so the backend always has full privilege over the frontend. 
This means that QEMU 
is set up in such a way that it has all of guest memory mapped all the time. 
Thus virtio has much less overhead, as it does not have to care about things 
like grant tables.

  Cheers,

    Paul

> >
> > > * how come Atto in a domU is getting better reads and writes than
> > > hardware for some packet sizes? Wouldn't it be wise to disable these
> > > caches and allow Linux in dom0 (and its kernel) to handle the I/O of all
> > > VMs?
> > >
> >
> >   We have no caching internally in XENVBD. The use of the XENBUS_CACHE
> objects is merely for bouncing, so any real caching of data will be going
> on in the Windows storage stack, over which we don't have much control, or
> in your dom0 kernel.
> 
> ACK.
> 
> 
> [cut]
> 
> 
> --
> Regards
> Jakub Kulesza
_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/win-pv-devel

 

