
Re: [Xen-devel] Poor network performance between DomU with multiqueue support



> On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote:
> > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote:
> > > [...]
> > > > > I think that's expected, because guest RX data path still uses
> > > > > grant_copy while guest TX uses grant_map to do zero-copy transmit.
> > > >
> > > > As far as I know, there are three main grant-related operations
> > > > used in the split device model: grant mapping, grant transfer and
> > > > grant copy. Grant transfer is not used now, and grant mapping and
> > > > grant transfer both involve "TLB" refresh work for the hypervisor,
> > > > am I right? Or does only grant transfer have this overhead?
> > >
> > > Transfer is not used so I can't tell. Grant unmap causes TLB flush.
> > >
> > > I saw in an email the other day that the XenServer folks have some
> > > planned improvement to avoid the TLB flush in Xen, targeted at the 4.6
> > > window. I can't say for sure it will get upstreamed as I don't work on
> > > that.
> > >
> > > > Does grant copy surely have more overhead than grant mapping?
> > > >
> > >
> > > At the very least the zero-copy TX path is faster than the previous
> > > copying path.
> > >
> > > But speaking of the micro operation I'm not sure.
> > >
> > > There was once a persistent-map prototype of netback / netfront that
> > > established a memory pool between FE and BE and then used memcpy to
> > > copy data. Unfortunately that prototype was not done right, so the
> > > result was not good.
> >
> > The newest mail about persistent grants I can find was sent on 16 Nov
> > 2012 (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html).
> > Why was it not done right and not merged upstream?
> 
> AFAICT there's one more memcpy than necessary, i.e. the frontend memcpys
> data into the pool and then the backend memcpys data out of the pool, when
> the backend should be able to use the page in the pool directly.

Memcpy should be cheaper than grant_copy, because memcpy avoids the hypercall 
that causes a "VM exit" into the Xen hypervisor, am I right? For the RX path, 
using memcpy on top of persistent grants may therefore give higher performance 
than the grant copy used now.
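
To make the comparison concrete, here is a rough, hedged sketch (not the actual 
xen-netback code; the two function names are invented for illustration) that 
contrasts a guest-RX copy issued through the GNTTABOP_copy interface with a 
plain memcpy into a page that was grant-mapped once into a persistent pool:

/*
 * Hedged sketch only, not the real netback code. Function names are
 * invented; the grant-table interfaces (struct gnttab_copy,
 * gnttab_batch_copy, GNTCOPY_dest_gref) are the ones from
 * xen/interface/grant_table.h and xen/grant_table.h.
 */
#include <linux/types.h>
#include <linux/string.h>
#include <xen/grant_table.h>
#include <xen/interface/grant_table.h>

/*
 * Path A (current guest RX): dom0 asks Xen to copy the data into a page
 * granted by the guest. Each batch costs a GNTTABOP_copy hypercall, i.e.
 * a trap into the hypervisor.
 */
static void rx_copy_via_hypercall(unsigned long src_gfn, uint16_t src_offset,
                                  uint16_t len, grant_ref_t guest_ref,
                                  domid_t guest_domid)
{
    struct gnttab_copy op = {
        .source.u.gmfn = src_gfn,        /* dom0's own page           */
        .source.domid  = DOMID_SELF,
        .source.offset = src_offset,
        .dest.u.ref    = guest_ref,      /* page granted by the guest */
        .dest.domid    = guest_domid,
        .dest.offset   = 0,
        .len           = len,
        .flags         = GNTCOPY_dest_gref,
    };

    /* This hypercall is the per-packet cost the grant-copy path pays. */
    gnttab_batch_copy(&op, 1);
}

/*
 * Path B (persistent-grant style): the guest's pool page was grant-mapped
 * into dom0 once at connect time; afterwards it is ordinary memory, so the
 * backend can fill it with a plain memcpy, no hypercall per packet.
 */
static void rx_copy_via_persistent_pool(void *pool_vaddr, const void *src,
                                        uint16_t len)
{
    memcpy(pool_vaddr, src, len);
}

The trade-off, as I understand it, is that the persistent path exchanges the 
per-packet hypercall for a one-off mapping cost plus a long-lived memory pool 
shared between FE and BE.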

I have seen "move grant copy to guest" and "Fix grant copy alignment problem" 
as optimization methods used in "NetChannel2" 
(http://www-archive.xenproject.org/files/xensummit_fall07/16_JoseRenatoSantos.pdf).
 Unfortunately, NetChannel2 seems not be supported from 2.6.32. Do you know 
them and are them be helpful for RX path optimization under current upstream 
implementation?

By the way, after revisiting the test results for the multi-queue PV 
implementation (kernel 3.17.4 + Xen 4.4), I find that when using four queues for 
netback/netfront, about three netback processes run with high CPU usage on the 
receiving Dom0 (about 85% usage per process, each on one CPU core), yet the 
aggregate throughput is only about 5 Gbps. I suspect there may be some bug or 
pitfall in the current multi-queue implementation, because consuming nearly 
three full CPU cores to receive only 5 Gbps seems abnormal.
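
For what it's worth, one quick sanity check is to confirm how many queues the 
vif actually negotiated. Below is a small, hedged userspace sketch using 
libxenstore; it assumes the frontend publishes the negotiated count under a 
"multi-queue-num-queues" key as in the netif protocol, and the exact xenstore 
path used here is a guess that may need adjusting on a given system:

/* Hedged sketch: read the negotiated queue count for a vif from xenstore.
 * Build with: gcc -o vif_queues vif_queues.c -lxenstore
 * The xenstore path below is an assumption and may differ per setup. */
#include <stdio.h>
#include <stdlib.h>
#include <xenstore.h>

int main(int argc, char **argv)
{
    const char *domid = argc > 1 ? argv[1] : "1";  /* guest domid */
    const char *vif   = argc > 2 ? argv[2] : "0";  /* vif index   */
    char path[128];
    unsigned int len;
    struct xs_handle *xsh = xs_open(0);

    if (!xsh) {
        perror("xs_open");
        return 1;
    }

    /* Assumed frontend path for the negotiated queue count. */
    snprintf(path, sizeof(path),
             "/local/domain/%s/device/vif/%s/multi-queue-num-queues",
             domid, vif);

    char *val = xs_read(xsh, XBT_NULL, path, &len);
    if (val) {
        printf("vif %s of domain %s negotiated %s queue(s)\n",
               vif, domid, val);
        free(val);
    } else {
        printf("%s not found (single queue, or the key/path differs)\n",
               path);
    }

    xs_close(xsh);
    return 0;
}

If fewer queues than expected were negotiated, the backend's advertised 
maximum (the xen_netback.max_queues / xen_netfront.max_queues module 
parameters, if I remember the names correctly) would be the first thing I 
would check.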

> >
> > And I also searched for virtio support in Xen, and found that you are
> > the one familiar with it, too (http://wiki.xen.org/wiki/Virtio_On_Xen),
> > :-). I am wondering what the current state of virtio on Xen is?
> 
> Yes, it was me. I have never had the time to revisit that. I don't think
> we support virtio network at the moment.
> 
> Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

