Xen project Mailing List

Re: [MirageOS-devel] Profiling Mirage on Xen

From: Thomas Leonard <talex5@xxxxxxxxx>

Date: Wed, 3 Sep 2014 17:03:24 +0100

Cc: "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Wed, 03 Sep 2014 16:03:36 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 2 September 2014 15:10, Thomas Leonard <talex5@xxxxxxxxx> wrote:
> On 21 August 2014 18:30, Dave Scott <Dave.Scott@xxxxxxxxxx> wrote:
>> Hi,
>>
>> On 21 Aug 2014, at 12:10, Thomas Leonard <talex5@xxxxxxxxx> wrote:
> [...]
>>> I've written up the profiling I've done so far here:
>>>
>>> http://roscidus.com/blog/blog/2014/08/15/optimising-the-unikernel/
>>>
>>> The graphs are quite interesting - if people familiar with the code
>>> could explain what's going on (especially with the block device) that
>>> would be great!
>>
>> The graphs are interesting!
>>
>> IIRC the grant unmap operation is very expensive since it involves a TLB
>> shootdown. This adds a large amount (relatively, compared to modern
>> flash-based disks) to the request/response latency. I think this is why the
>> performance is so terrible with one request at a time. I suspect that the
>> batching you're seeing with two requests is an artefact of the backend,
>> which is probably trying to unmap both grant references at once for
>> efficiency. When I wrote a user-space block backend batching the unmaps made
>> a massive difference.
>
> I wonder if this applies to ARM. You should be able to invalidate
> individual TLB entries there, I think.
>
>> There is a block protocol extension called "persistent grants" that we
>> haven't implemented (yet). This does the obvious thing and pre-shares a set
>> of pages. We might suffer a bit because of extra copies (i.e. we might have
>> to copy into the pre-shared pages) but we would save the unmap overhead, so
>> it might be worth it.
>
> Had a go at this, but it didn't make much difference. However, I've
> discovered that dd in dom0 isn't too fast either. I originally tested
> with hdparm, which reports 20 MB/s as expected:
>
> $ hdparm -t /dev/mmcblk0
> Timing buffered disk reads: 62 MB in 3.07 seconds = 20.21 MB/sec
>
> dd's speed seems to depend a lot on the block size. Using
> 4096*11=45056 bytes (which I assume is what dom0 would do in response
> to a guest request), I get 16.9 MB/s:
>
> $ dd iflag=direct if=/dev/vg0/bench of=/dev/null bs=45056 count=1000
> 1000+0 records in
> 1000+0 records out
> 45056000 bytes (45 MB) copied, 2.65911 s, 16.9 MB/s
>
> bs=65536 gives 18.8 MB/s and bs=131072 gives 20.8 MB/s. Linux domU
> reports 20.36 MB/sec from hdparm but only 18.6 MB/s from dd
> (bs=131072). So perhaps Mirage is doing pretty well already.

I had a look at how hdparm gets the full speed. It's using 256 pages
per request, which requires support for indirect pages in blkfront
(with direct requests, the maximum is 11 pages per request).

I added support here:

https://github.com/talex5/mirage-block-xen/commits/master

With this, I got 21.32 MB/s read and 9.17 MB/s write. The results vary
a fair bit with block size (those were the best), but that seems like
an improvement (the previous best was 18.27r and 7.07w).

-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
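
The figures quoted in the message above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names (`request_bytes`, `throughput_mb_s`, `percent_gain`) are hypothetical, the 4096-byte page size is taken from the "4096*11=45056" in the message, and megabytes are treated as 10^6 bytes because dd reports 45056000 bytes as "45 MB".

```python
PAGE_SIZE = 4096  # bytes per page, as implied by "4096*11=45056" in the message


def request_bytes(pages, page_size=PAGE_SIZE):
    """Payload size of a single block request covering `pages` pages."""
    return pages * page_size


def throughput_mb_s(total_bytes, seconds):
    """Throughput in decimal megabytes (10**6 bytes) per second, as dd prints it."""
    return total_bytes / seconds / 1e6


def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


# Direct requests carry at most 11 pages; hdparm was seen using 256 pages.
print(request_bytes(11))    # 45056 bytes per direct request
print(request_bytes(256))   # 1048576 bytes (1 MiB) per indirect request

# dd run from the message: 1000 blocks of 45056 bytes in 2.65911 s.
print(round(throughput_mb_s(1000 * request_bytes(11), 2.65911), 1))  # 16.9

# Previous best versus the results with indirect pages (MB/s).
print(round(percent_gain(18.27, 21.32), 1))  # read:  16.7
print(round(percent_gain(7.07, 9.17), 1))    # write: 29.7
```

On these numbers an indirect request moves roughly 23 times as much data as a direct one, so per-request overhead is spread over far more data. That is consistent with, but does not by itself prove, the read and write gains reported above.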
