Re: [Xen-devel] Odd blkdev throughput results
> > The big thing is that on network RX it is currently dom0 that does the
> > copy.  In the CMP case this leaves the data in the shared cache, ready to
> > be accessed by the guest.  In the SMP case it doesn't help at all.  In
> > netchannel2 we're moving the copy to the guest CPU, and trying to
> > eliminate it with smart hardware.
> >
> > Block IO doesn't require a copy at all.
>
> Well, not in blkback by itself, but certainly from the in-memory disk
> image. Unless I misunderstood Keir's post recently, page flipping is
> basically dead code, so I thought the numbers should at least point in
> roughly the same direction.

Blkback has always DMA-ed directly into guest memory when reading data
from the disk drive (the normal use case), in which case there's no copy -
I think that was Ian's point.  In contrast, the netback driver has to do a
copy in the normal case.

If you're using a ramdisk then there must be a copy somewhere, although
I'm not sure exactly where it happens!

Cheers,
Mark

> > > This is not my question. What strikes me is that for the blkdev
> > > interface, the CMP setup is 13% *slower* than SMP, at 661.99 MB/s.
> > >
> > > Now, any ideas? I'm mildly familiar with both netback and blkback, and
> > > I'd never expected something like that. Any hint appreciated.
> >
> > How stable are your results with hdparm? I've never really trusted it as
> > a benchmarking tool.
>
> So far, all the experiments I've done look fairly reasonable. Standard
> deviation is low, and since I've been tracing netback reads I'm fairly
> confident that the volume wasn't left in domU memory somewhere.
>
> I'm not so much interested in bio or physical disk performance, but in
> the relative performance of how much can be squeezed through the buffer
> ring before and after applying some changes. It's hardly a physical disk
> benchmark, but it's simple and for the purpose given it seems okay.
>
> > The ramdisk isn't going to be able to DMA data into the domU's buffer on
> > a read, so it will have to copy it.
>
> Right...
>
> > The hdparm running in domU probably
> > doesn't actually look at any of the data it requests, so it stays local
> > to the dom0 CPU's cache (unlike a real app).
>
> hdparm performs sequential 2MB read()s over a 3s period. It's not
> calling the block layer directly or anything. That'll certainly hit
> domU caches?
>
> > Doing all that copying
> > in dom0 is going to beat up the domU in the shared cache in the CMP
> > case, but won't affect it as much in the SMP case.
>
> Well, I could live with blaming L2 footprint. Just wanted to hear whether
> someone has a different explanation. And I would expect similar results
> on net RX then, but I may be mistaken.
>
> Furthermore, I need to apologize because I failed to use netperf
> correctly and managed to report the TX path in my original post :P. The
> real numbers are rather 885.43 (SMP) vs. 1295.46 (CMP), but the
> difference compared to blk reads stays the same.
>
> regards,
> daniel

--
Push Me Pull You - Distributed SCM tool (http://www.cl.cam.ac.uk/~maw48/pmpu/)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
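To make the "blkback doesn't need a copy" point concrete, here is a rough sketch of the shape of a blkif read request: the guest hands dom0 grant references to its own pages, so for a real disk blkback can map those pages and let the controller DMA straight into guest memory. The field names, types and segment limit below are approximations for illustration, not the literal xen/include/public/io/blkif.h definitions.

#include <stdint.h>

typedef uint32_t grant_ref_t;

#define SKETCH_MAX_SEGMENTS 11   /* per-request segment limit, approximate */

/* One guest I/O buffer page, identified by a grant reference. */
struct sketch_blkif_segment {
    grant_ref_t gref;            /* grant ref of a guest page            */
    uint8_t     first_sect;      /* first 512-byte sector used in page   */
    uint8_t     last_sect;       /* last 512-byte sector used in page    */
};

/* Simplified ring request from blkfront (domU) to blkback (dom0). */
struct sketch_blkif_request {
    uint8_t  operation;          /* e.g. read or write                   */
    uint8_t  nr_segments;        /* number of segments that follow       */
    uint16_t handle;             /* virtual device handle                */
    uint64_t id;                 /* guest cookie, echoed in the response */
    uint64_t sector_number;      /* starting sector on the backend disk  */
    struct sketch_blkif_segment seg[SKETCH_MAX_SEGMENTS];
};

Because the data path is described by grants rather than by a dom0-owned buffer, a read from a physical disk never needs a bounce copy in dom0; a ramdisk backend, by contrast, has to memcpy from dom0 memory into those granted pages, which is the copy discussed above.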
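Since the thread also hinges on what the hdparm measurement actually exercises, here is a minimal stand-in for the access pattern described above: sequential 2 MB read()s for roughly three seconds, reporting MB/s. The 2 MB chunk size and 3-second window are taken from the discussion; the /dev/xvda path is a placeholder, and real hdparm additionally flushes caches and does its own accounting, so this only approximates its behaviour.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define CHUNK   (2 * 1024 * 1024)   /* 2 MB per read()        */
#define SECONDS 3.0                 /* measurement window     */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/xvda";  /* placeholder device */
    char *buf = malloc(CHUNK);
    int fd = open(dev, O_RDONLY);
    if (fd < 0 || !buf) {
        perror("setup");
        return 1;
    }

    double start = now(), elapsed = 0.0;
    long long total = 0;

    /* Read sequentially in 2 MB chunks until the time window expires. */
    do {
        ssize_t n = read(fd, buf, CHUNK);
        if (n <= 0)
            break;                   /* EOF or error: stop measuring */
        total += n;
        elapsed = now() - start;
    } while (elapsed < SECONDS);

    elapsed = now() - start;
    if (elapsed > 0.0)
        printf("%.2f MB/s (%lld bytes in %.2f s)\n",
               total / (1024.0 * 1024.0) / elapsed, total, elapsed);

    close(fd);
    free(buf);
    return 0;
}

Run against the guest's block device from inside domU (e.g. "./readbench /dev/xvda"); since plain read() goes through the guest page cache, the data does end up touched by the domU CPU, which is the point being argued in the quoted exchange.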