Xen project Mailing List

At 2015-02-27 18:59:52, "Wei Liu" <wei.liu2@xxxxxxxxxx> wrote:
>Cc'ing David (XenServer kernel maintainer)
>
>On Fri, Feb 27, 2015 at 05:21:11PM +0800, openlui wrote:
>> >On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote:
>> >> > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote:
>> >> > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote:
>> >> > > > [...]
>> >> > > > > > I think that's expected, because guest RX data path still 
>> >> > > > > > uses grant_copy while guest TX uses grant_map to do zero-copy transmit.
>> >> > > > >
>> >> > > > > As far as I know, there are three main grant-related 
>> >> > > > > operations used in split
>> >> > > > device model: grant mapping, grant transfer and grant copy.
>> >> > > > > Grant transfer has not used now, and grant mapping and grant 
>> >> > > > > transfer both
>> >> > > > involve "TLB" refresh work for hypervisor, am I right?  Or only 
>> >> > > > grant transfer has this overhead?
>> >> > > >
>> >> > > > Transfer is not used so I can't tell. Grant unmap causes TLB flush.
>> >> > > >
>> >> > > > I saw in an email the other day XenServer folks has some planned 
>> >> > > > improvement to avoid TLB flush in Xen to upstream in 4.6 window. 
>> >> > > > I can't speak for sure it will get upstreamed as I don't work on that.
>> >> > > >
>> >> > > > > Does grant copy surely has more overhead than grant mapping?
>> >> > > > >
>> >> > > >
>> >> > > > At the very least the zero-copy TX path is faster than previous copying path.
>> >> > > >
>> >> > > > But speaking of the micro operation I'm not sure.
>> >> > > >
>> >> > > > There was once persistent map prototype netback / netfront that 
>> >> > > > establishes a memory pool between FE and BE then use memcpy to 
>> >> > > > copy data. Unfortunately that prototype was not done right so 
>> >> > > > the result was not
>> >> > good.
>> >> > >
>> >> > > The newest mail about persistent grant I can find is sent from 16 
>> >> > > Nov
>> >> > > 2012
>> >> > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html).
>> >> > > Why is it not done right and not merged into upstream?
>> >> > 
>> >> > AFAICT there's one more memcpy than necessary, i.e. frontend memcpy 
>> >> > data into the pool then backend memcpy data out of the pool, when 
>> >> > backend should be able to use the page in pool directly.
>> >> 
>> >> Memcpy should cheaper than grant_copy because the former needs not the 
>> >> "hypercall" which will cause "VM Exit" to "XEN Hypervisor", am I 
>> >> right? For RX path, using memcpy based on persistent grant table may 
>> >> have higher performance than using grant copy now.
>> >
>> >In theory yes. Unfortunately nobody has benchmarked that properly.
>
>> I have some testing for RX performance using persistent grant method
>> and upstream method (3.17.4 branch), the results show that persistent
>> grant method does have higher performance than upstream method (from
>> 3.5Gbps to about 6Gbps). And I find that persistent grant mechanism
>> has already used in blkfrong/blkback, I am wondering why there are no
>> efforts to replace the grant copy by persistent grant now, at least in
>> RX path. Are there other disadvantages in persistent grant method
>> which stop we use it? 
>> 
>
>I've seen numbers better than 6Gbps. See upstream changeset
>1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b.

Thanks, Wei. 
The throughout I mentioned (3.5Gbps and 6Gbps) is for UDP 1400 bytes packet, I think the result based on 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b is for TCP. 

>Persistent grant is not silver bullet. There is email thread on the
>list discussing whether it should be removed in block driver.

I have tried to look for the thread but no detailed info. Could you give me some keyword to find the thread, thanks.

>XenServer folks have been working on improving network performance. It's
>my understanding that they choose different routes than persistent
>grant. David might have more insight.
>Wei.
>
>> PS. I used pkt-gen to send packet from dom0 to a domU running on
>> another dom0, the CPUs of both dom0 is Intel E5640 2.4GHz, and the two
>> dom0s is connected with a 10GE NIC.
>> 
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@xxxxxxxxxxxxx
>http://lists.xen.org/xen-devel
Re: [Xen-devel] Poor network performance between DomU with multiqueue support